Python Foundations
X1: The Spotify API¶

Instructor: Wesley Beckner

For this workbook we're going to use the spotipy library to access the Spotify Web API!

Install and import libraries¶

First we will need to install it:

!pip install spotipy

Collecting spotipy
  Downloading spotipy-2.19.0-py3-none-any.whl (27 kB)
Requirement already satisfied: six>=1.15.0 in /usr/local/lib/python3.7/dist-packages (from spotipy) (1.15.0)
Collecting urllib3>=1.26.0
  Downloading urllib3-1.26.8-py2.py3-none-any.whl (138 kB)
     |████████████████████████████████| 138 kB 14.7 MB/s
Collecting requests>=2.25.0
  Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 687 kB/s
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/dist-packages (from requests>=2.25.0->spotipy) (2.0.11)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests>=2.25.0->spotipy) (2021.10.8)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests>=2.25.0->spotipy) (2.10)
Installing collected packages: urllib3, requests, spotipy
  Attempting uninstall: urllib3
    Found existing installation: urllib3 1.24.3
    Uninstalling urllib3-1.24.3:
      Successfully uninstalled urllib3-1.24.3
  Attempting uninstall: requests
    Found existing installation: requests 2.23.0
    Uninstalling requests-2.23.0:
      Successfully uninstalled requests-2.23.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.27.1 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
Successfully installed requests-2.27.1 spotipy-2.19.0 urllib3-1.26.8

Then import the libraries we will use:

from spotipy import client
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials, SpotifyOAuth
import sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Setup developer account¶

You'll need to visit this link to set up a developer account, then fill in your authorization information below:

SPOTIPY_CLIENT_ID = ""
SPOTIPY_CLIENT_SECRET = ""
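Pasting secrets directly into a shared notebook makes them easy to leak. One alternative is to read them from environment variables. The following is a minimal sketch, assuming you have already exported SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET in your environment; the helper name load_credentials is hypothetical and not part of spotipy.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    Hypothetical helper for this workbook; not part of spotipy.
    """
    env = os.environ if env is None else env
    client_id = env.get("SPOTIPY_CLIENT_ID", "")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET", "")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before running."
        )
    return client_id, client_secret
```

In the notebook this would be used as SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET = load_credentials(), which fails early with a readable message if either value is missing.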

Top 10 tracks of an artist¶

We can grab the top 10 tracks of Led Zeppelin. The loop we use for this relies on enumerate, which pairs each item in a list with its index:

alist = ['elephant', 'pinecone', 'toothbrush']
for index, item in enumerate(alist):
  print(str(index) + ' ' + item)

0 elephant
1 pinecone
2 toothbrush

lz_uri = 'spotify:artist:36QJpDe2go2KgaRleHCDTp'

spotify = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials(
    client_id=SPOTIPY_CLIENT_ID,
    client_secret=SPOTIPY_CLIENT_SECRET
))

results = spotify.artist_top_tracks(lz_uri)
ids = []

for i, track in enumerate(results['tracks']):
    ids.append(track['id'])
    if i < 10:
      print('track    : ' + track['name'])
      print('audio    : ' + track['preview_url'])
      print('cover art: ' + track['album']['images'][0]['url'])
      print()

track    : Stairway to Heaven - Remaster
audio    : https://p.scdn.co/mp3-preview/8226164717312bc411f8635580562d67e191a754?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b273c8a11e48c91a982d086afc69

track    : Immigrant Song - Remaster
audio    : https://p.scdn.co/mp3-preview/8455599677a13017978dcd3f4b210937f0a16bcb?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b27390a50cfe99a4c19ff3cbfbdb

track    : Whole Lotta Love - 1990 Remaster
audio    : https://p.scdn.co/mp3-preview/ce11b19a4d2de9976d7626df0717d0073863909c?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b273fc4f17340773c6c3579fea0d

track    : Black Dog - Remaster
audio    : https://p.scdn.co/mp3-preview/9b76619fd9d563a48d38cc90ca00c3008327b52e?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b273c8a11e48c91a982d086afc69

track    : Kashmir - Remaster
audio    : https://p.scdn.co/mp3-preview/f3ca68c9ceaa3435d5bd55c0199ba0b09b916cce?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b273765b0617b572bdd1dbdc7d8e

track    : Ramble On - 1990 Remaster
audio    : https://p.scdn.co/mp3-preview/83383aceb01ea27b0bffdedfaebe55e29b33aca2?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b273fc4f17340773c6c3579fea0d

track    : Rock and Roll - Remaster
audio    : https://p.scdn.co/mp3-preview/e7ea8a13f7caf6942c5447e9cd96aac2a076d85a?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b273c8a11e48c91a982d086afc69

track    : Going to California - Remaster
audio    : https://p.scdn.co/mp3-preview/4bdae56c6a9f7a8ec42b753cb7bea2c77ec68f1e?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b273c8a11e48c91a982d086afc69

track    : Good Times Bad Times - 1993 Remaster
audio    : https://p.scdn.co/mp3-preview/c1f024eb57b569b926c8e68cab0a6056dc7d9654?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b2736f2f499c1df1f210c9b34b32

track    : D'yer Mak'er - Remaster
audio    : https://p.scdn.co/mp3-preview/863a26744fa4389f1dc61557133df3453be82d7b?cid=93cef3f9255042d7854a6014e0929504
cover art: https://i.scdn.co/image/ab67616d0000b2731816adce1d49e35d3ce9a1d1
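The loop above concatenates track['preview_url'] directly into a string, which works only when every track has a preview. Spotify documents this field as nullable, so a more defensive version may be useful. This is a sketch on a hand-written stand-in dictionary; describe_track is a hypothetical helper, and the placeholder strings are arbitrary choices.

```python
def describe_track(track):
    """Format one track dict from a top-tracks response as printable text."""
    # preview_url may be None; fall back to a placeholder instead of
    # concatenating None to a string (which raises TypeError).
    preview = track.get("preview_url") or "(no preview available)"
    images = track.get("album", {}).get("images", [])
    cover = images[0]["url"] if images else "(no cover art)"
    return (
        f"track    : {track['name']}\n"
        f"audio    : {preview}\n"
        f"cover art: {cover}"
    )


# A hand-written stand-in for one entry of results['tracks'].
example = {"name": "Example Song", "preview_url": None,
           "album": {"images": []}}
print(describe_track(example))
```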

The top tracks API only returns the top 10 tracks by an artist:

len(ids)

ids

['5CQ30WqJwcep0pYcV4AMNc',
 '78lgmZwycJ3nzsdgmPPGNx',
 '0hCB0YR03f6AmQaHbwWDe8',
 '3qT4bUD1MaWpGrTwcvguhb',
 '6Vjk8MNXpQpi0F4BefdTyq',
 '3MODES4TNtygekLl146Dxd',
 '4PRGxHpCpF2yoOHYKQIEwD',
 '70gbuMqwNBE2Y5rkQJE9By',
 '0QwZfbw26QeUoIy82Z2jYp',
 '4ItljeeAXtHsnsnnQojaO2']

Spotify has an audio features API that can be used for ML or data visualization:

features = spotify.audio_features(ids)
features

[{'acousticness': 0.58,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/5CQ30WqJwcep0pYcV4AMNc',
  'danceability': 0.338,
  'duration_ms': 482830,
  'energy': 0.34,
  'id': '5CQ30WqJwcep0pYcV4AMNc',
  'instrumentalness': 0.0032,
  'key': 9,
  'liveness': 0.116,
  'loudness': -12.049,
  'mode': 0,
  'speechiness': 0.0339,
  'tempo': 82.433,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/5CQ30WqJwcep0pYcV4AMNc',
  'type': 'audio_features',
  'uri': 'spotify:track:5CQ30WqJwcep0pYcV4AMNc',
  'valence': 0.197},
 {'acousticness': 0.013,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/78lgmZwycJ3nzsdgmPPGNx',
  'danceability': 0.564,
  'duration_ms': 146250,
  'energy': 0.932,
  'id': '78lgmZwycJ3nzsdgmPPGNx',
  'instrumentalness': 0.169,
  'key': 11,
  'liveness': 0.349,
  'loudness': -10.068,
  'mode': 1,
  'speechiness': 0.0554,
  'tempo': 112.937,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/78lgmZwycJ3nzsdgmPPGNx',
  'type': 'audio_features',
  'uri': 'spotify:track:78lgmZwycJ3nzsdgmPPGNx',
  'valence': 0.619},
 {'acousticness': 0.0484,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/0hCB0YR03f6AmQaHbwWDe8',
  'danceability': 0.412,
  'duration_ms': 333893,
  'energy': 0.902,
  'id': '0hCB0YR03f6AmQaHbwWDe8',
  'instrumentalness': 0.131,
  'key': 9,
  'liveness': 0.405,
  'loudness': -11.6,
  'mode': 1,
  'speechiness': 0.405,
  'tempo': 89.74,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/0hCB0YR03f6AmQaHbwWDe8',
  'type': 'audio_features',
  'uri': 'spotify:track:0hCB0YR03f6AmQaHbwWDe8',
  'valence': 0.422},
 {'acousticness': 0.396,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/3qT4bUD1MaWpGrTwcvguhb',
  'danceability': 0.437,
  'duration_ms': 295387,
  'energy': 0.864,
  'id': '3qT4bUD1MaWpGrTwcvguhb',
  'instrumentalness': 0.0314,
  'key': 4,
  'liveness': 0.242,
  'loudness': -7.842,
  'mode': 0,
  'speechiness': 0.0904,
  'tempo': 81.394,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/3qT4bUD1MaWpGrTwcvguhb',
  'type': 'audio_features',
  'uri': 'spotify:track:3qT4bUD1MaWpGrTwcvguhb',
  'valence': 0.749},
 {'acousticness': 0.452,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/6Vjk8MNXpQpi0F4BefdTyq',
  'danceability': 0.483,
  'duration_ms': 517125,
  'energy': 0.615,
  'id': '6Vjk8MNXpQpi0F4BefdTyq',
  'instrumentalness': 0.000414,
  'key': 2,
  'liveness': 0.0512,
  'loudness': -8.538,
  'mode': 1,
  'speechiness': 0.0497,
  'tempo': 80.576,
  'time_signature': 3,
  'track_href': 'https://api.spotify.com/v1/tracks/6Vjk8MNXpQpi0F4BefdTyq',
  'type': 'audio_features',
  'uri': 'spotify:track:6Vjk8MNXpQpi0F4BefdTyq',
  'valence': 0.594},
 {'acousticness': 0.072,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/3MODES4TNtygekLl146Dxd',
  'danceability': 0.468,
  'duration_ms': 263333,
  'energy': 0.607,
  'id': '3MODES4TNtygekLl146Dxd',
  'instrumentalness': 0.000852,
  'key': 9,
  'liveness': 0.225,
  'loudness': -11.367,
  'mode': 1,
  'speechiness': 0.0336,
  'tempo': 98.429,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/3MODES4TNtygekLl146Dxd',
  'type': 'audio_features',
  'uri': 'spotify:track:3MODES4TNtygekLl146Dxd',
  'valence': 0.886},
 {'acousticness': 0.000582,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/4PRGxHpCpF2yoOHYKQIEwD',
  'danceability': 0.317,
  'duration_ms': 220561,
  'energy': 0.887,
  'id': '4PRGxHpCpF2yoOHYKQIEwD',
  'instrumentalness': 0.00258,
  'key': 9,
  'liveness': 0.0891,
  'loudness': -7.292,
  'mode': 1,
  'speechiness': 0.0375,
  'tempo': 169.613,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/4PRGxHpCpF2yoOHYKQIEwD',
  'type': 'audio_features',
  'uri': 'spotify:track:4PRGxHpCpF2yoOHYKQIEwD',
  'valence': 0.871},
 {'acousticness': 0.943,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/70gbuMqwNBE2Y5rkQJE9By',
  'danceability': 0.503,
  'duration_ms': 212161,
  'energy': 0.265,
  'id': '70gbuMqwNBE2Y5rkQJE9By',
  'instrumentalness': 0.045,
  'key': 2,
  'liveness': 0.0867,
  'loudness': -15.913,
  'mode': 1,
  'speechiness': 0.0333,
  'tempo': 78.044,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/70gbuMqwNBE2Y5rkQJE9By',
  'type': 'audio_features',
  'uri': 'spotify:track:70gbuMqwNBE2Y5rkQJE9By',
  'valence': 0.522},
 {'acousticness': 0.0382,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/0QwZfbw26QeUoIy82Z2jYp',
  'danceability': 0.476,
  'duration_ms': 166267,
  'energy': 0.717,
  'id': '0QwZfbw26QeUoIy82Z2jYp',
  'instrumentalness': 7.61e-05,
  'key': 9,
  'liveness': 0.0818,
  'loudness': -9.192,
  'mode': 1,
  'speechiness': 0.0949,
  'tempo': 93.584,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/0QwZfbw26QeUoIy82Z2jYp',
  'type': 'audio_features',
  'uri': 'spotify:track:0QwZfbw26QeUoIy82Z2jYp',
  'valence': 0.753},
 {'acousticness': 0.262,
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/4ItljeeAXtHsnsnnQojaO2',
  'danceability': 0.525,
  'duration_ms': 262748,
  'energy': 0.929,
  'id': '4ItljeeAXtHsnsnnQojaO2',
  'instrumentalness': 2.9e-05,
  'key': 9,
  'liveness': 0.0754,
  'loudness': -8.56,
  'mode': 0,
  'speechiness': 0.0784,
  'tempo': 163.503,
  'time_signature': 4,
  'track_href': 'https://api.spotify.com/v1/tracks/4ItljeeAXtHsnsnnQojaO2',
  'type': 'audio_features',
  'uri': 'spotify:track:4ItljeeAXtHsnsnnQojaO2',
  'valence': 0.556}]

pd.DataFrame(features)

	danceability	energy	key	loudness	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo	type	id	uri	track_href	analysis_url	duration_ms	time_signature
0	0.338	0.340	9	-12.049	0	0.0339	0.580000	0.003200	0.1160	0.197	82.433	audio_features	5CQ30WqJwcep0pYcV4AMNc	spotify:track:5CQ30WqJwcep0pYcV4AMNc	https://api.spotify.com/v1/tracks/5CQ30WqJwcep...	https://api.spotify.com/v1/audio-analysis/5CQ3...	482830	4
1	0.564	0.932	11	-10.068	1	0.0554	0.013000	0.169000	0.3490	0.619	112.937	audio_features	78lgmZwycJ3nzsdgmPPGNx	spotify:track:78lgmZwycJ3nzsdgmPPGNx	https://api.spotify.com/v1/tracks/78lgmZwycJ3n...	https://api.spotify.com/v1/audio-analysis/78lg...	146250	4
2	0.412	0.902	9	-11.600	1	0.4050	0.048400	0.131000	0.4050	0.422	89.740	audio_features	0hCB0YR03f6AmQaHbwWDe8	spotify:track:0hCB0YR03f6AmQaHbwWDe8	https://api.spotify.com/v1/tracks/0hCB0YR03f6A...	https://api.spotify.com/v1/audio-analysis/0hCB...	333893	4
3	0.437	0.864	4	-7.842	0	0.0904	0.396000	0.031400	0.2420	0.749	81.394	audio_features	3qT4bUD1MaWpGrTwcvguhb	spotify:track:3qT4bUD1MaWpGrTwcvguhb	https://api.spotify.com/v1/tracks/3qT4bUD1MaWp...	https://api.spotify.com/v1/audio-analysis/3qT4...	295387	4
4	0.483	0.615	2	-8.538	1	0.0497	0.452000	0.000414	0.0512	0.594	80.576	audio_features	6Vjk8MNXpQpi0F4BefdTyq	spotify:track:6Vjk8MNXpQpi0F4BefdTyq	https://api.spotify.com/v1/tracks/6Vjk8MNXpQpi...	https://api.spotify.com/v1/audio-analysis/6Vjk...	517125	3
5	0.468	0.607	9	-11.367	1	0.0336	0.072000	0.000852	0.2250	0.886	98.429	audio_features	3MODES4TNtygekLl146Dxd	spotify:track:3MODES4TNtygekLl146Dxd	https://api.spotify.com/v1/tracks/3MODES4TNtyg...	https://api.spotify.com/v1/audio-analysis/3MOD...	263333	4
6	0.317	0.887	9	-7.292	1	0.0375	0.000582	0.002580	0.0891	0.871	169.613	audio_features	4PRGxHpCpF2yoOHYKQIEwD	spotify:track:4PRGxHpCpF2yoOHYKQIEwD	https://api.spotify.com/v1/tracks/4PRGxHpCpF2y...	https://api.spotify.com/v1/audio-analysis/4PRG...	220561	4
7	0.503	0.265	2	-15.913	1	0.0333	0.943000	0.045000	0.0867	0.522	78.044	audio_features	70gbuMqwNBE2Y5rkQJE9By	spotify:track:70gbuMqwNBE2Y5rkQJE9By	https://api.spotify.com/v1/tracks/70gbuMqwNBE2...	https://api.spotify.com/v1/audio-analysis/70gb...	212161	4
8	0.476	0.717	9	-9.192	1	0.0949	0.038200	0.000076	0.0818	0.753	93.584	audio_features	0QwZfbw26QeUoIy82Z2jYp	spotify:track:0QwZfbw26QeUoIy82Z2jYp	https://api.spotify.com/v1/tracks/0QwZfbw26QeU...	https://api.spotify.com/v1/audio-analysis/0QwZ...	166267	4
9	0.525	0.929	9	-8.560	0	0.0784	0.262000	0.000029	0.0754	0.556	163.503	audio_features	4ItljeeAXtHsnsnnQojaO2	spotify:track:4ItljeeAXtHsnsnnQojaO2	https://api.spotify.com/v1/tracks/4ItljeeAXtHs...	https://api.spotify.com/v1/audio-analysis/4Itl...	262748	4
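The ten ids above fit comfortably in a single request. As an assumption based on the Web API documentation, the audio features endpoint accepts at most 100 ids per call, so a larger collection of tracks would need to be sent in batches. A minimal sketch of that batching, using a stand-in fetch function rather than a live API call (chunked and fetch_in_batches are hypothetical helpers):

```python
def chunked(items, size=100):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_in_batches(fetch, ids, size=100):
    """Call `fetch` on each batch of ids and concatenate the results."""
    results = []
    for batch in chunked(ids, size):
        results.extend(fetch(batch))
    return results


# Demo with a stand-in fetch function; in the notebook you would pass
# spotify.audio_features instead.
def fake_fetch(batch):
    return [{"id": track_id} for track_id in batch]


rows = fetch_in_batches(fake_fetch, [str(n) for n in range(250)])
print(len(rows))
```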

Add additional artists¶

Let's get some other artist data to compare with:

artist_dict = {'zep': '36QJpDe2go2KgaRleHCDTp',
               'tswift': '06HL4z0CvFAxyc27GXpf02',
               'debussy': '1Uff91EOsvd99rtAupatMP',
               'luttrell': '4EOyJnoiiOJ4vuNhSBArB2',
               'johnwill': '3dRfiJ2650SZu6GbydcHNb'}

color_dict = {'zep': 'tab:blue',
               'tswift': 'tab:green',
               'debussy': 'tab:orange',
               'luttrell': 'tab:red',
               'johnwill': 'tab:pink'}

for artist, uri in artist_dict.items():
  print(artist + ' ' + uri)

zep 36QJpDe2go2KgaRleHCDTp
tswift 06HL4z0CvFAxyc27GXpf02
debussy 1Uff91EOsvd99rtAupatMP
luttrell 4EOyJnoiiOJ4vuNhSBArB2
johnwill 3dRfiJ2650SZu6GbydcHNb

ids = []
artists = []
colors = []
for artist, uri in artist_dict.items():
  results = spotify.artist_top_tracks('spotify:artist:' + uri)


  for i, track in enumerate(results['tracks']):
      ids.append(track['id'])
      artists.append(artist)
      colors.append(color_dict[artist])
      if i < 1:
        print('track    : ' + track['name'])
        print('cover art: ' + track['album']['images'][0]['url'])
        print()

track    : Stairway to Heaven - Remaster
cover art: https://i.scdn.co/image/ab67616d0000b273c8a11e48c91a982d086afc69

track    : All Too Well (10 Minute Version) (Taylor's Version) (From The Vault)
cover art: https://i.scdn.co/image/ab67616d0000b273318443aab3531a0558e79a4d

track    : Clair de Lune, L. 32
cover art: https://i.scdn.co/image/ab67616d0000b2736e7bb273ff9cb1de1e1d4d0a

track    : Twin Souls
cover art: https://i.scdn.co/image/ab67616d0000b2735dea3da9d2751a0fa7b23fd3

track    : Carol of the Bells
cover art: https://i.scdn.co/image/ab67616d0000b273a68c06155b7c3cf82b00cb96

features = spotify.audio_features(ids)
df = pd.DataFrame(features)
df['artist'] = artists
df['color'] = colors
feat_names = df.columns[:11]
print(feat_names)

Index(['danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness',
       'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo'],
      dtype='object')

df.head()

	danceability	energy	key	loudness	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo	type	id	uri	track_href	analysis_url	duration_ms	time_signature	artist	color
0	0.338	0.340	9	-12.049	0	0.0339	0.5800	0.003200	0.1160	0.197	82.433	audio_features	5CQ30WqJwcep0pYcV4AMNc	spotify:track:5CQ30WqJwcep0pYcV4AMNc	https://api.spotify.com/v1/tracks/5CQ30WqJwcep...	https://api.spotify.com/v1/audio-analysis/5CQ3...	482830	4	zep	tab:blue
1	0.564	0.932	11	-10.068	1	0.0554	0.0130	0.169000	0.3490	0.619	112.937	audio_features	78lgmZwycJ3nzsdgmPPGNx	spotify:track:78lgmZwycJ3nzsdgmPPGNx	https://api.spotify.com/v1/tracks/78lgmZwycJ3n...	https://api.spotify.com/v1/audio-analysis/78lg...	146250	4	zep	tab:blue
2	0.412	0.902	9	-11.600	1	0.4050	0.0484	0.131000	0.4050	0.422	89.740	audio_features	0hCB0YR03f6AmQaHbwWDe8	spotify:track:0hCB0YR03f6AmQaHbwWDe8	https://api.spotify.com/v1/tracks/0hCB0YR03f6A...	https://api.spotify.com/v1/audio-analysis/0hCB...	333893	4	zep	tab:blue
3	0.437	0.864	4	-7.842	0	0.0904	0.3960	0.031400	0.2420	0.749	81.394	audio_features	3qT4bUD1MaWpGrTwcvguhb	spotify:track:3qT4bUD1MaWpGrTwcvguhb	https://api.spotify.com/v1/tracks/3qT4bUD1MaWp...	https://api.spotify.com/v1/audio-analysis/3qT4...	295387	4	zep	tab:blue
4	0.483	0.615	2	-8.538	1	0.0497	0.4520	0.000414	0.0512	0.594	80.576	audio_features	6Vjk8MNXpQpi0F4BefdTyq	spotify:track:6Vjk8MNXpQpi0F4BefdTyq	https://api.spotify.com/v1/tracks/6Vjk8MNXpQpi...	https://api.spotify.com/v1/audio-analysis/6Vjk...	517125	3	zep	tab:blue

Visualize the audio features¶

# <groupby>.<select>.<agg operation>
df.groupby('artist')[feat_names].mean()

	danceability	energy	key	loudness	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo
artist
debussy	0.2868	0.019716	3.6	-30.3510	0.8	0.04209	0.990800	0.927000	0.07817	0.07747	81.2996
johnwill	0.2783	0.151770	3.9	-20.4347	0.5	0.03766	0.903000	0.615920	0.18085	0.19182	87.2151
luttrell	0.5310	0.706000	4.5	-9.3468	0.7	0.03504	0.127682	0.669259	0.18714	0.16701	117.5059
tswift	0.5073	0.621700	4.2	-6.4014	0.9	0.05926	0.198158	0.000263	0.13591	0.40550	125.7815
zep	0.4523	0.705800	7.3	-10.2421	0.7	0.09121	0.280518	0.038355	0.17212	0.61690	105.0253
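The groupby, select, aggregate pattern in the comment above can be easier to follow on a table small enough to check by hand. A sketch on made-up numbers (the toy DataFrame is not Spotify data):

```python
import pandas as pd

toy = pd.DataFrame({
    "artist": ["a", "a", "b", "b"],
    "energy": [0.2, 0.4, 0.9, 0.7],
    "tempo": [80.0, 100.0, 120.0, 140.0],
})

# <groupby>: split the rows by artist
# <select>:  keep only the numeric feature columns
# <agg>:     reduce each group to its column means
means = toy.groupby("artist")[["energy", "tempo"]].mean()
print(means)
```

Each artist ends up as one row of the result, with the artist label as the index, which is why the plots that follow can label the bars by artist without extra work.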

fig, ax = plt.subplots(figsize=(10,10))
df.groupby('artist')[feat_names].mean().plot(kind='barh', ax=ax)

<matplotlib.axes._subplots.AxesSubplot at 0x7f7e11316ad0>

png

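It is difficult to compare all the features at once because they are on very different scales. One tactic is to normalize the features:

# min-max scaling would be: (x - x.min()) / (x.max() - x.min())
# here we standardize instead (z-score): (x - x.mean()) / x.std()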

scaled = df[feat_names].apply(lambda x: (x - x.mean()) / (x.std()))
scaled['artist'] = df['artist']

fig, ax = plt.subplots(figsize=(10,10))
scaled.groupby('artist')[feat_names].mean().plot(kind='barh', ax=ax)

<matplotlib.axes._subplots.AxesSubplot at 0x7f9c3412a390>

png
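Two common ways to put features on a comparable scale are min-max scaling and z-score standardization; the cell above applies the z-score form. A small sketch contrasting the two on made-up numbers, so the results can be verified by hand:

```python
import pandas as pd

toy = pd.DataFrame({"loudness": [-30.0, -20.0, -10.0],
                    "tempo": [80.0, 100.0, 120.0]})

# Min-max scaling maps each column onto the interval [0, 1].
minmax = toy.apply(lambda x: (x - x.min()) / (x.max() - x.min()))

# Z-score standardization gives each column mean 0 and sample
# standard deviation 1 (pandas .std() uses ddof=1 by default).
zscore = toy.apply(lambda x: (x - x.mean()) / x.std())

print(minmax)
print(zscore)
```

Min-max keeps every value in a fixed range but is sensitive to a single extreme track; the z-score is unbounded but expresses each value in units of spread, which is the more usual input for PCA.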

We can then investigate if these scaled and centered features separate out under a dimensionality reduction (a topic we explore in unsupervised learning):

scaled[feat_names].head()

	danceability	energy	key	loudness	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo
0	-0.469349	-0.300769	1.218214	0.351423	-1.587451	-0.342620	0.186938	-1.068262	-0.305866	-0.356202	-0.626640
1	0.980922	1.462203	1.784825	0.561988	0.617342	0.042005	-1.138510	-0.671989	1.739793	1.230425	0.286535
2	0.005519	1.372864	1.218214	0.399148	0.617342	6.296175	-1.055757	-0.762811	2.231454	0.489749	-0.407896
3	0.165947	1.259700	-0.198314	0.798594	-1.587451	0.668137	-0.243189	-1.000862	0.800371	1.719197	-0.657743
4	0.461135	0.518179	-0.764925	0.724615	0.617342	-0.059966	-0.112281	-1.074920	-0.874787	1.136431	-0.682231

from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(scaled[feat_names])

PCA(n_components=2)

color_dict

{'debussy': 'tab:orange',
 'johnwill': 'tab:pink',
 'luttrell': 'tab:red',
 'tswift': 'tab:green',
 'zep': 'tab:blue'}

# project the same scaled features the PCA was fitted on
X_pca = pca.transform(scaled[feat_names])

plt.scatter(X_pca[:, 0], X_pca[:, 1], alpha=0.8, c=df['color'].values,
            edgecolor='grey')
plt.xlabel('First PC')
plt.ylabel('Second PC')

Text(0, 0.5, 'Second PC')

png
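A projection is only meaningful when the data being transformed is on the same scale as the data the PCA was fitted on. To make the mechanics concrete, here is a minimal NumPy sketch of what a two-component PCA computes, using a singular value decomposition on synthetic data. It is an illustration under those assumptions, not scikit-learn's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))      # synthetic stand-in for the feature table

# Standardize, then fit and project the *same* standardized matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Rows of Vt are the principal directions, ordered by explained variance.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
components = Vt[:2]
Z_pca = Z @ components.T          # shape (20, 2)

# Fraction of total variance captured by each direction.
explained = (S ** 2) / (S ** 2).sum()
print(Z_pca.shape, explained[:2])
```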

🏋️‍♀️ Exercises¶

🕵️‍♀️ Exercise 1: Find artist urls and build dataset¶

a. Navigate to the Spotify web application, pick 5 artists, and update the dictionary below

import matplotlib.colors as mcolors
mcolors.TABLEAU_COLORS

OrderedDict([('tab:blue', '#1f77b4'),
             ('tab:orange', '#ff7f0e'),
             ('tab:green', '#2ca02c'),
             ('tab:red', '#d62728'),
             ('tab:purple', '#9467bd'),
             ('tab:brown', '#8c564b'),
             ('tab:pink', '#e377c2'),
             ('tab:gray', '#7f7f7f'),
             ('tab:olive', '#bcbd22'),
             ('tab:cyan', '#17becf')])

# Cell for 1.a

artist_dict = {'zep': '36QJpDe2go2KgaRleHCDTp',
               'tswift': '06HL4z0CvFAxyc27GXpf02',
               'debussy': '1Uff91EOsvd99rtAupatMP',
               'luttrell': '4EOyJnoiiOJ4vuNhSBArB2',
               'johnwill': '3dRfiJ2650SZu6GbydcHNb',
               'chopin': '7y97mc3bZRFXzT2szRM4L4',
               '2pac': '1ZwdS5xdxEREPySFridCfh',
               'ganja': '1a6oIpEh4DGgaqgWg5xwd3'}

color_dict = {'zep': 'tab:blue',
               'tswift': 'tab:green',
               'debussy': 'tab:orange',
               'luttrell': 'tab:red',
               'johnwill': 'tab:pink',
               'chopin': 'tab:brown',
               '2pac': 'tab:gray',
               'ganja': 'tab:purple'}
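The values in artist_dict are the id portion of each artist's page address. A minimal sketch for pulling that id out of a copied link, assuming links shaped like https://open.spotify.com/artist/<id>?si=... (artist_id_from_url is a hypothetical helper, and the si value in the example is made up):

```python
from urllib.parse import urlparse


def artist_id_from_url(url):
    """Extract the artist id from an open.spotify.com artist URL.

    Hypothetical helper; assumes the path contains 'artist/<id>'.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "artist" not in parts or parts.index("artist") + 1 >= len(parts):
        raise ValueError(f"not an artist URL: {url!r}")
    return parts[parts.index("artist") + 1]


url = "https://open.spotify.com/artist/36QJpDe2go2KgaRleHCDTp?si=abc123"
print(artist_id_from_url(url))
```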

b. Build the feature set using spotify.artist_top_tracks('spotify:artist:' + uri) and store the resulting track information

# Cell for 1.b
ids = []
artists = []
colors = []
for artist, uri in artist_dict.items():
  results = spotify.artist_top_tracks('spotify:artist:' + uri)


  for i, track in enumerate(results['tracks']):
      ids.append(track['id'])
      artists.append(artist)
      colors.append(color_dict[artist])
      if i < 1:
        print('track    : ' + track['name'])
        print('cover art: ' + track['album']['images'][0]['url'])
        print()

track    : Stairway to Heaven - Remaster
cover art: https://i.scdn.co/image/ab67616d0000b273c8a11e48c91a982d086afc69

track    : All Too Well (10 Minute Version) (Taylor's Version) (From The Vault)
cover art: https://i.scdn.co/image/ab67616d0000b273318443aab3531a0558e79a4d

track    : Clair de Lune, L. 32
cover art: https://i.scdn.co/image/ab67616d0000b2736e7bb273ff9cb1de1e1d4d0a

track    : Twin Souls
cover art: https://i.scdn.co/image/ab67616d0000b2735dea3da9d2751a0fa7b23fd3

track    : Carol of the Bells
cover art: https://i.scdn.co/image/ab67616d0000b273a68c06155b7c3cf82b00cb96

track    : Nocturne No. 2 in E-Flat Major, Op. 9 No. 2
cover art: https://i.scdn.co/image/ab67616d0000b27355c82855070525581e2c6fee

track    : Hit 'Em Up - Single Version
cover art: https://i.scdn.co/image/ab67616d0000b273d81a092eb373ded457d94eec

track    : Miss You
cover art: https://i.scdn.co/image/ab67616d0000b273bbfbb9cd91d8676f68676e59

features = spotify.audio_features(ids)
df = pd.DataFrame(features)
df['artist'] = artists
df['color'] = colors
feat_names = df.columns[:11]
print(feat_names)

Index(['danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness',
       'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo'],
      dtype='object')

💫 Exercise 2: Visualize Features¶

a. Create a boxplot of each feature, grouped by artist (similar to example above)

# Cell for 2.a
fig, ax = plt.subplots(6, 2, figsize=(10,25))
indices = np.argwhere(ax)
for index, feature in enumerate(feat_names):
  df.boxplot(by='artist', column=feature, ax=ax[indices[index][0],
                                                indices[index][1]])
plt.tight_layout()

png
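In the cell above, np.argwhere(ax) is used only to list the (row, column) positions of the subplot grid in row-major order. The same mapping can be written with divmod, which may be easier to read. A sketch assuming a grid with 2 columns (grid_position is a hypothetical helper):

```python
import numpy as np


def grid_position(index, ncols=2):
    """Map a flat, row-major index to a (row, col) pair."""
    return divmod(index, ncols)


# Compare against np.argwhere on an all-True 6x2 array, which lists
# positions in the same row-major order.
grid = np.ones((6, 2), dtype=bool)
positions = [tuple(int(v) for v in p) for p in np.argwhere(grid)]
print(positions[:4])
print([grid_position(i) for i in range(4)])
```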

b. Normalize the features this time, then create the boxplot

# Cell for 2.b
scaled = df[feat_names].apply(lambda x: (x - x.mean()) / (x.std()))
scaled['artist'] = df['artist']

fig, ax = plt.subplots(6, 2, figsize=(10,25))
indices = np.argwhere(ax)
for index, feature in enumerate(feat_names):
  scaled.boxplot(by='artist', column=feature, ax=ax[indices[index][0],
                                                indices[index][1]])
plt.tight_layout()

png

⚖️ Exercise 3: Write a function that returns a similarity score between two artists¶

# Cell for 3
avg = df.groupby('artist')[feat_names].mean()
avg

	danceability	energy	key	loudness	mode	speechiness	acousticness	instrumentalness	liveness	valence	tempo
artist
2pac	0.8102	0.692400	5.0	-6.2682	0.8	0.18134	0.097199	0.000569	0.20875	0.62680	100.6579
chopin	0.3255	0.018298	4.6	-30.1471	0.5	0.04479	0.988800	0.906600	0.08981	0.10532	86.1455
debussy	0.2868	0.019716	3.6	-30.3510	0.8	0.04209	0.990800	0.927000	0.07817	0.07747	81.2996
ganja	0.6506	0.701500	5.2	-5.0898	0.3	0.08868	0.083128	0.114312	0.12767	0.40886	108.9341
johnwill	0.2783	0.151770	3.9	-20.4347	0.5	0.03766	0.903000	0.615920	0.18085	0.19182	87.2151
luttrell	0.5310	0.706000	4.5	-9.3468	0.7	0.03504	0.127682	0.669259	0.18714	0.16701	117.5059
tswift	0.5073	0.621700	4.2	-6.4014	0.9	0.05926	0.198158	0.000263	0.13591	0.40550	125.7815
zep	0.4523	0.705800	7.3	-10.2421	0.7	0.09121	0.280518	0.038355	0.17212	0.61690	105.0253

x = avg.loc['2pac'].values
y = avg.loc['zep'].values

def distance(xlabel, ylabel):
  x = avg.loc[xlabel].values
  y = avg.loc[ylabel].values
  return np.sqrt(((x-y)**2).sum())

distance('2pac', 'ganja')

8.38304605268043

distance('chopin', 'debussy')

4.9615683289484185
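Because the distance above is computed on raw averages, features with large numeric ranges such as tempo and loudness contribute far more than the features bounded between 0 and 1. One possible refinement, sketched here on made-up numbers, is to z-score the columns first and then map the distance d to a similarity of 1 / (1 + d), so that identical artists score 1 and very different artists approach 0. The similarity function and the 1 / (1 + d) mapping are illustrative choices, not the only reasonable ones.

```python
import numpy as np
import pandas as pd


def similarity(avg, a, b):
    """Similarity in (0, 1] between two rows of `avg`; 1 means identical."""
    # z-score each column so no single feature dominates the distance
    z = (avg - avg.mean()) / avg.std()
    d = np.sqrt(((z.loc[a] - z.loc[b]) ** 2).sum())
    return 1.0 / (1.0 + d)


# Toy averages (made-up numbers, not Spotify data).
toy_avg = pd.DataFrame(
    {"energy": [0.70, 0.69, 0.02], "tempo": [100.0, 105.0, 85.0]},
    index=["a", "b", "c"],
)
print(similarity(toy_avg, "a", "b"), similarity(toy_avg, "a", "c"))
```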

scores = np.zeros((avg.shape[0], avg.shape[0]))
for index, name in enumerate(avg.index):
  for index2, name2 in enumerate(avg.index):
    if index < 1:
      print('{:.2f}'.format(distance(name, name2)) + ' ' + name + ' ' + name2 + ' ')
    scores[index, index2] = distance(name,name2)

0.00 2pac 2pac 
27.99 2pac chopin 
30.97 2pac debussy 
8.38 2pac ganja 
19.61 2pac johnwill 
17.16 2pac luttrell 
25.14 2pac tswift 
6.35 2pac zep

dists = pd.DataFrame(scores, index=avg.index, columns=avg.index)
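The nested loop above fills the score matrix one pair at a time. With NumPy broadcasting the whole matrix can be computed in one expression. A sketch on three hand-checkable points (pairwise_distances is a hypothetical helper):

```python
import numpy as np


def pairwise_distances(values):
    """Euclidean distance between every pair of rows of a 2-D array."""
    diffs = values[:, None, :] - values[None, :, :]   # shape (n, n, k)
    return np.sqrt((diffs ** 2).sum(axis=-1))          # shape (n, n)


points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
print(pairwise_distances(points))
```

Applied to this notebook, pairwise_distances(avg.to_numpy()) should give the same values as the scores array, and can be wrapped in a DataFrame with avg.index as both index and columns.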

import seaborn as sns

sns.heatmap(dists)

<matplotlib.axes._subplots.AxesSubplot at 0x7f9c32ee0190>

png

import numpy as np
fig, ax = plt.subplots(1, 1, figsize = (10,10))

# create a mask to white-out the upper triangle
mask = np.triu(np.ones_like(dists, dtype=bool))

# we'll want a divergent colormap for this so our eye
# is not attracted to the values close to 0
cmap = sns.diverging_palette(230, 20, as_cmap=True)

sns.heatmap(dists, mask=mask, cmap=cmap, ax=ax)

<matplotlib.axes._subplots.AxesSubplot at 0x7f9c32a1b450>

png
