One-to-one SQL join from among the many?

I have two tables in MySQL. One of them (let’s say, for the purpose of simplicity) is a listing of books. The other is a listing of reviews of those books along with a 1-5 ranking. Each book may have 0 or more reviews. It’s your standard one-to-many relationship.

However, I want to have a one-to-one left join in order to generate a listing of books along with their most recent ranking, like such:

Da Vinci Code: 1
The Omnivore’s Dilemma: 5
The Half-Blood Prince: N/A
Freakonomics: 3
Cesar’s Way: 3

Again, this is not the average ranking, simply the one accompanying the most recent review, assuming there is one.

Here’s the question: What’s the simplest method of joining the two tables to include only the ranking from that most recent review? I just know I’m overlooking something simple.

Published by Waldo Jaquith

Waldo Jaquith (JAKE-with) is an open government technologist who lives near Char­lottes­­ville, VA, USA. more »

13 replies on “One-to-one SQL join from among the many?”

  1. I believe you want:

    SELECT LAST(b.book, r.rank) from book AS b, rank AS r WHERE b.id = r.fid ORDER BY r.date

    At least, that is my take without seeing the schema and my limited SQL knowledge. I’m curious if others have a better approach. Note that r.fid assumes that it is a foreign key.

  2. I can’t think of a way to do this with an inner join, but it’s easy to do with a subselect. The following example works under SQL Server, but it’s pretty basic SQL, so I think it’ll all work the same way in MySQL as well:

    create table book (book_name varchar(32))
    create table rank (book_name varchar(32), rank int, date datetime)

    insert book (book_name) values (‘a’)
    insert book (book_name) values (‘b’)
    insert book (book_name) values (‘c’)

    insert rank (book_name, rank, date) values (‘a’, 1, ‘2006-01-01’)
    insert rank (book_name, rank, date) values (‘a’, 2, ‘2006-02-01’)
    insert rank (book_name, rank, date) values (‘a’, 3, ‘2006-03-01’)
    insert rank (book_name, rank, date) values (‘a’, 4, ‘2006-04-01’)

    insert rank (book_name, rank, date) values (‘b’, 4, ‘2006-01-01’)
    insert rank (book_name, rank, date) values (‘b’, 3, ‘2006-02-01’)
    insert rank (book_name, rank, date) values (‘b’, 2, ‘2006-03-01’)
    insert rank (book_name, rank, date) values (‘b’, 1, ‘2006-04-01’)

    select b.book_name,
    (select top 1 rank r from rank r where r.book_name=b.book_name order by r.date)
    from book b
    group by b.book_name

  3. Here is a question for you… why not get the average ranking for the book as opposed to only the last ranking?

    For example:

    Select
    BookId,
    BookName,
    BookPrice,
    BookEtc,
    (Select AVG(Ranking) From Ranking Where fkBookId = BookId) As BookRank
    From Book

    Otherwise you could use a similiar subquery to get only the one rank, i.e.
    (Select Top 1 Ranking From Ranking Where fkBookId = BookId Order By RankingDate DESC)

    Just a thought. Good luck. Having written literally dozens of stored procedures in the last few weeks I sympathize…

    Craig

  4. Here is a question for you… why not get the average ranking for the book as opposed to only the last ranking?

    I’ve presented a simplified version of what I’m actually trying to do here—this is actually a peer-review system for submitted papers, and the editor’s review system needs to be able to see only the last ranking, and then assign the paper to another reviewer to get his opinion. (Wash, rinse repeat.) Suffice it to say, the average ranking is useless to me.

    Tom, your subselect suggestion is a good one — I’m going to give it a whirl. Thanks for taking the time to put that together. I’m still scratching my head over the problem of not being able to do this with a simple join, but that may well indicate my shortcomings more than MySQL’s. :)

  5. How malleable is the current schema? I’m not sure what to tell you if you’re stuck with specifically that which you presented (i.e. no date column), but if you can manipulate it, the date column would solve all your problems. Make it a datetime, and it’s as easy as:

    SELECT book.name, review.rank FROM book, review WHERE book.id = review.book_id ORDER BY review.date DESC LIMIT 1

    I’m confident I’ve seen similar statements in your code many times before, so maybe I misunderstood your question?

  6. ^^ Yeah, if you can add columns, add an id or date column and Max() it out. LAST_INSERT_ID() would also work if you go with an auto-increment column.

  7. You could also create a primary key on the review table and you can choose SELECT MAX(review.ID), review.score from review WHERE book.book_id = review.book_ID…

    That’s probably the easiest solution, I think.

  8. Waldo,
    I’ve done something similar to that. I needed to extract the last value from a flowmeter…I think, or maybe it was a scale, I can’t remember.
    Let me look through some old projects and I’ll get back to you. The final solution was pretty simple, as I recall.
    Alton

  9. The schema is actually quite malleable. I’ve got a primary key (an auto_increment int), and a timestamp.

    The trick is all of the joins:

    SELECT submission.id, submission.title, submission_reader.reader,
    submission_reader.recommendation, submission.genre, submission_author.name AS author,
    DATE_FORMAT(submission_reader.date_modified, "%m/%d/%y") AS date_modified,
    MAX(submission_reader.date_created) AS date_created
    FROM submission_reader
    LEFT JOIN submission ON submission.id = submission_reader.submission_id
    LEFT JOIN submission_author ON submission.author_id = submission_author.id
    WHERE submission_reader.recommendation IS NOT NULL AND submission.status = "under review"
    GROUP BY submission.id
    ORDER BY submission_reader.date_created ASC, submission.title ASC

    The whole query makes my head spin. :)

Comments are closed.