OK, Google...I mean, neo4j...
Time to put my MusicAlbum graph database through its paces.
Who played with Al Di Meola on more than one album?
MATCH (DiM:Musician {name: "Al Di Meola"})-[]->(A:Album)<-[]-(M1:Musician)
WHERE DiM.name<>M1.name
WITH collect(DISTINCT A.name) as CommonAlbums, M1.name as CommonPlayers
WHERE size(CommonAlbums) > 1
RETURN CommonPlayers, CommonAlbums;
I excluded Di Meola from the query (WHERE DiM.name<>M1.name
), and had to collect(DISTINCT A.name) as CommonAlbums
because Land of the Midnight Sun was showing up six times — once for each track on which Di Meola plays a different instrument.
Maybe I didn't need the DISTINCT
keyword for the other albums, but having a single query template makes things a whole lot easier.
I also omitted the [:PLAYED_ON]
relationship label since that's the only relationship in the database so far. When (if??) I finally decide to include [:PUBLISHED]
or [:PRODUCED]
relationships, I'll have to go back and include [:PLAYED_ON]
in the query.
In any event...collect
, as the name implies, collects the resulting album names and constructs a list consisting of the album names, naming the list — what else? — CommonAlbums. DISTINCT
removes the duplicates, and size()
counts the number of album names in CommonAlbums. When size(CommonAlbums)
is greater than 1, that means that the musician and Di Meola shared more than one album.
I also had to return M1.name as CommonPlayers
because M1's scope (the neo4j manual refers to this as ‘context’) ended before the [WHERE size(CommonAlbums) > 1
— i.e., M1 was no longer accessible and I couldn't RETURN M1.name
after the WITH
.
Here's an easy question:
On what albums did Lenny White play?
MATCH (:Musician {name: "Lenny White"})-[]->(A:Album)
RETURN A.name;
UPDATE:
You can count on Di Meola to complicate the database. Thanks to his six different instruments on Land of the Midnight Sun, a query on the albums he played on returns six occurrences of that album. Still, no major snag, a simple DISTINCT
will take care of that:
On what albums did Al Di Meola play?
MATCH (:Musician {name: "Al Di Meola"})-[]->(A:Album)
RETURN DISTINCT A.name as `Albums Di Meola played on`;
Which musicians played on both In a Silent Way and Bitches Brew?
MATCH (:Album {name: "In a Silent Way"})<-[]-(M:Musician)-[]->(:Album {name: "Bitches Brew"})
RETURN DISTINCT M.name;
I could even have made the output a little friendlier:
MATCH (:Album {name: "In a Silent Way"})<-[]-(M:Musician)-[]->(:Album {name: "Bitches Brew"})
RETURN DISTINCT M.name as `Played on both In a Silent Way and Bitches Brew`;
Let's get a little ambitious:
Which musicians played any kind of piano on Bitches Brew?
MATCH (M:Musician)-[r:PLAYED_ON]->(:Album {name: "Bitches Brew"})
WITH M as Musician,
FILTER(ins in r.instruments WHERE ins CONTAINS "Piano") as insFilt
WHERE insFilt <> []
RETURN Musician.name, insFilt;
This presented a challenge because three musicians played electric piano — Joe Zawinul on the left, Larry Young in the center, and Chick Corea on the right. Their respective instruments are “Electric Piano (left)”, “Elecric Piano (center)”, and “Electric Piano (right)”. A simple MATCH ... [:PLAYED_ON {instruments: ["Piano"]}]
would not have worked — the FILTER
and the CONTAINS
did the trick, although I could've made a case-insensitive CONTAINS
.
I also had to qualify the FILTER
with WHERE insFilt <> []
because the query was returning empty lists.
Which musicians played on track four of Stardust?
MATCH (M:Musician)-[r:PLAYED_ON]->(:Album {name: "Stardust"})
RETURN
CASE
WHEN r.tracks IS NULL THEN M.name
WHEN 4 in r.tracks THEN M.name
END;
Oddly, the query wouldn't run in the browser, returning the literal string “CASE WHEN r.tracks IS NULL THEN M.name WHEN 4 in r.tracks THEN M.name END”, so I had to do this in cypher-shell. I originally had ELSE NULL
before the END
ending the CASE
, got rid of it because I didn't like the look of the NULL in the results but they still came up anyway. Oh well...
UPDATE: Now it returns the musicians, including the NULLs. Go figure.
On second thought, I didn't need CASE
:
MATCH (M:Musician)-[r:PLAYED_ON]->(:Album {name: "Stardust"})
WITH M as mn,
FILTER (mn in M.name
WHERE ((r.tracks IS NULL) OR (4 in r.tracks))) as finalName
WHERE finalName <> []
RETURN finalName;
And it worked in the browser to boot:
And I got rid of the NULL
answers.
I may or may not have mentioned this in the earlier posts in this series: When r.tracks is NULL, that means that the musician played the r.instruments on all tracks. r.tracks IS NULL
therefore implies x in r.tracks
, where x in [1, 2,...,size(r.tracks) + 1].
However, another issue:
MATCH (M:Musician)-[r:PLAYED_ON]->(:Album {name: "Bitches Brew"})
WITH M as mn,
FILTER (mn in M.name WHERE ((r.tracks IS NULL) OR (4 in r.tracks))) as finalName
WHERE finalName <> []
RETURN finalName;
Juma Santos plays on two instruments on Track 4 so he appears twice. No big deal, again, a DISTINCT
will take care of that:
MATCH (M:Musician)-[r:PLAYED_ON]->(:Album {name: "Bitches Brew"})
WITH M as mn,
FILTER (mn in M.name WHERE ((r.tracks IS NULL) OR (4 in r.tracks))) as finalName
WHERE finalName <> []
RETURN DISTINCT finalName;
As I was looking at the results, a nagging issue came up — Miles Davis plays on all Bitches Brew tracks except the 4th. Thus, if any musician played only on the fourth track and none other, technically he didn't play with Davis.
On the other hand, it's my model, and I can very well define playing with a musician as appearing on one of his albums, regardless of whether they actually shared at least one track. So there.
Nevertheless, I checked if any musician played only on track 4, MATCH (M:Musician)-[:PLAYED_ON {tracks:[4]}]->(:Album {name: "Bitches Brew"}) RETURN M.name;
and to my relief got zero rows.
Lastly, in regard to any queries involving a track, you're not likely to remember the track number of a given title. I'd have prefered to use some kind of indexOf() function, which cypher unfortunately does not have as of this writing.
However, I'd like to think of this as a back end, or queries to the database engine that will be passed from a program in, e.g., Java, which would take care of implementing an indexOf() function.
And take that as a teaser for the next posts.