neo4j Chases the Blues Away

Published Jul 24, 2017

The latest model for my Music Album graph database survived a few bumps, and I'd been happy with it for the most part. I did mention that issue with Airto Moreira playing two instruments on the same Bitches Brew track, and decided, maybe against my better judgment, to create two relationships, one for each instrument. So his portion of the Bitches Brew album looked like this:
Each :PLAYED_ON relationship had the attribute {tracks: [6]}; one of them had {instrument: "Percussion"} and the other {instrument: "Cuica"}. That worked, though I wasn't entirely happy with it. I had a bad feeling about it.
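A minimal sketch of how those two relationships might have been created, assuming the (:Musician) and (:Album) nodes already exist and are identified by a name property as elsewhere in this post. The exact statements originally used aren't shown, so treat this as one plausible reconstruction:

```cypher
// Assumption: both nodes already exist with these name values.
MATCH (m:Musician {name: "Airto Moreira"}), (a:Album {name: "Bitches Brew"})
// One relationship per instrument, both restricted to track 6.
MERGE (m)-[:PLAYED_ON {instrument: "Percussion", tracks: [6]}]->(a)
MERGE (m)-[:PLAYED_ON {instrument: "Cuica", tracks: [6]}]->(a);
```

Because the two MERGE patterns differ in their instrument value, each one creates its own arrow between the same pair of nodes.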

As if on cue, Al Di Meola came around with his “versatility”, playing several instruments, tracks unspecified, on Casino, Elegant Gypsy, and Orange and Blue. Not to mention the several instruments he also played on Land of the Midnight Sun.

Time to change the model again, else I would have maybe two dozen arrows coming out of Di Meola. This time, the instrument attribute would be a list, renamed to instruments. Thus,

MERGE (M:Musician {name: "Al Di Meola"})
WITH M
MATCH (A:Album {name: "Orange and Blue"})
MERGE (M)-[:PLAYED_ON {instruments: ["Acoustic and Electric Guitars", "Percussion", "Drums", "Synthesizers", "Grand piano", "Other instruments"]}]->(A);

To be consistent, and to ease queries, I went through the other relationships and converted the {instrument} attribute into a list, thusly:

MATCH (:Musician)-[r:PLAYED_ON]->() SET r.instruments = [r.instrument];

followed by

MATCH (:Musician)-[r:PLAYED_ON]->() SET r.instrument = NULL;

The first command simply created a new instruments attribute, a list containing the value of the relationship's instrument attribute. Yeah, that was kinda hard to read: instruments is a list, instrument is a single value. The second command cleared out the now-unnecessary instrument attribute.
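A slightly more defensive variant of that conversion, offered as a sketch rather than what was actually run. It assumes some relationships (Di Meola's, for instance) already carry an instruments list and have no instrument value, in which case an unfiltered SET would try to replace that list with one built from NULL:

```cypher
// Touch only relationships that still have the old single-value property.
MATCH (:Musician)-[r:PLAYED_ON]->()
WHERE r.instrument IS NOT NULL
// Wrap the single value in a one-element list...
SET r.instruments = [r.instrument]
// ...and drop the old property in the same statement.
REMOVE r.instrument;
```

REMOVE has the same effect as setting the property to NULL; it just states the intent more directly.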

Note that I didn't include the label of the node that the [:PLAYED_ON] relationship pointed to, since in this database, all [:PLAYED_ON] relationships terminate in (:Album) nodes. For that matter, I didn't need to include the label of the first node either, since they all start at (:Musician) nodes.

Thus, e.g., Miles Davis's relationship with Kind of Blue becomes [:PLAYED_ON {instruments: ["Trumpet"]}].

On the other hand, one of Di Meola's relationships is (M:Musician {name: "Al Di Meola"})-[:PLAYED_ON {instruments: ["6-string Acoustic Guitar", "Gong"], tracks: [6]}]->(A:Album {name: "Land of the Midnight Sun"}), meaning he played the 6-string acoustic guitar and the gong on track 6 of the album Land of the Midnight Sun.

After the necessary MERGE, MATCH, and MERGE operations, my graph now looks like this:
The model looks more elaborate than necessary. For instance, I didn't really have to identify what instruments each musician played on what tracks. On the other hand, including that data does increase the information now available.

For instance, if I were listening to Bitches Brew and I wanted to know who played the Soprano Sax on track 3, that's a simple MATCH (M:Musician)-[R:PLAYED_ON]->(A:Album {name: "Bitches Brew"}) WHERE "Soprano Sax" IN R.instruments AND 3 IN R.tracks RETURN M.name.

I could also check for musicians who played together with Al Di Meola on more than one album. I also have to account for musicians who play on all tracks, since for them the tracks attribute of [:PLAYED_ON] will be NULL and a test like 3 IN R.tracks won't match. More on those next time.
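As a rough preview, here is how those two queries might look. This is a sketch under the model described in this post, assuming a missing tracks attribute means "played on every track"; the aliases other and shared are just illustrative names:

```cypher
// Who played Soprano Sax on track 3 of Bitches Brew, counting
// relationships without a tracks attribute as covering all tracks.
MATCH (m:Musician)-[r:PLAYED_ON]->(:Album {name: "Bitches Brew"})
WHERE "Soprano Sax" IN r.instruments
  AND (r.tracks IS NULL OR 3 IN r.tracks)
RETURN m.name;

// Musicians who share more than one album with Al Di Meola.
MATCH (:Musician {name: "Al Di Meola"})-[:PLAYED_ON]->(a:Album)<-[:PLAYED_ON]-(other:Musician)
WHERE other.name <> "Al Di Meola"
// DISTINCT guards against counting an album twice when a musician
// has several PLAYED_ON relationships to it.
WITH other, count(DISTINCT a) AS shared
WHERE shared > 1
RETURN other.name, shared
ORDER BY shared DESC;
```

These are two separate statements, so they need to be run one at a time, or in a client that accepts several semicolon-terminated statements.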

For now, I'll address the uniqueness of the relationships between musicians and albums — i.e., there's only the [:PLAYED_ON] relationship. In some cases, that's an understatement, e.g., Stanley Clarke, Al Di Meola, and Jean-Luc Ponty co-produced The Rite of Strings. Now that is another possible relationship. The publisher is another possible node, its relationship with an album being something like [:PUBLISHED]. I just might add more nodes and relationships, after I've recovered from developing this database, and writing about it.
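A sketch of what those additions could look like. The [:PRODUCED] relationship type, the :Publisher label, and the $publisherName parameter are all hypothetical names picked for illustration, as is the direction of [:PUBLISHED] (publisher to album); only [:PUBLISHED] itself and the three co-producers come from the paragraph above:

```cypher
// Hypothetical [:PRODUCED] relationship for the three co-producers.
MATCH (a:Album {name: "The Rite of Strings"})
UNWIND ["Stanley Clarke", "Al Di Meola", "Jean-Luc Ponty"] AS producer
MERGE (m:Musician {name: producer})
MERGE (m)-[:PRODUCED]->(a);

// Hypothetical (:Publisher) node; the name is supplied as a parameter.
MERGE (p:Publisher {name: $publisherName})
WITH p
MATCH (a:Album {name: "The Rite of Strings"})
MERGE (p)-[:PUBLISHED]->(a);
```

Using MERGE for the musicians means existing nodes are reused, so a producer who also played on the album ends up with both a [:PLAYED_ON] and a [:PRODUCED] arrow to it.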

Discover and read more posts from Daniel Escasa