見出し画像

CS50 2023 - Week7 SQL


概要

Week7では、SQLについて学びます。
講義の主な内容は、テーブル、データ型、ステートメント、制約、インデックス、キーワード、関数、トランザクション、レースコンディション、SQLインジェクション攻撃です。


Lab 7

Songs

Spotifyのデータベースから、曲に関する様々な質問に回答するSQLクエリを作成します。

以下は、実際に私が提出したコードです。

1.sql

SELECT name FROM songs;

2.sql

SELECT name FROM songs ORDER BY tempo ASC;

3.sql

SELECT name FROM songs ORDER BY duration_ms DESC LIMIT 5;

4.sql

SELECT name FROM songs WHERE danceability > 0.75 AND energy > 0.75 AND valence > 0.75;

5.sql

SELECT AVG(energy) FROM songs;

6.sql

SELECT songs.name FROM songs
JOIN artists ON songs.artist_id = artists.id
WHERE artists.name = 'Post Malone';

7.sql

SELECT AVG(songs.energy) FROM songs
JOIN artists ON songs.artist_id = artists.id
WHERE artists.name = 'Drake';

8.sql

SELECT name FROM songs WHERE name LIKE '%feat.%';

answers.txt

Below are the energy, valence, and danceability averages of the top 100 songs in the 2018 list.

Energy (0.65906): The listener prefers dynamic and fast-paced music, indicating a liking for energetic and lively songs.

Valence (0.484443): The listener maintains a balance between joyful and brighter songs and those with a sadder or darker mood,
suggesting an appreciation for expressing a range of emotions through music.

Danceability (0.71646): The listener leans towards rhythm-oriented, danceable music,
implying that they enjoy moving to the beat when listening to music.

Therefore, we can characterise the listener's audio aura as energetic, balanced in emotional expression, and dance-oriented.

While averaging provides a broad understanding of the listener's music preferences,
it may need to include subtleties in their musical taste due to outliers, mood variations, and song popularity.
Extreme values can skew averages and do not account for mood changes or the impact of a listener's favourite songs.
To enhance this approach, we could consider using medians instead of averages to mitigate the effect of outliers,
perform time-based analyses to account for mood fluctuations,
and use weighted averages based on song play counts for a more representative Audio Aura.
Combining these and other methods would likely yield the best characterisation of a listener's Audio Aura.

Problem Set 7

Movies

Songsと似た課題ですが、今度は映画です。
IMDb(Internet Movie Database)のデータベースを用いて、様々な質問に回答するSQLクエリを作成します。

1.sql

SELECT title FROM movies WHERE year = 2008;

2.sql

SELECT birth FROM people WHERE name = 'Emma Stone';

3.sql

SELECT title FROM movies WHERE year >= 2018 ORDER BY title;

4.sql

SELECT COUNT(*) FROM ratings WHERE rating = 10.0;

5.sql

SELECT title, year FROM movies WHERE title LIKE 'Harry Potter%' ORDER BY year;

6.sql

SELECT AVG(rating) FROM movies JOIN ratings ON movies.id = ratings.movie_id WHERE year = 2012;

7.sql

SELECT movies.title, ratings.rating
FROM movies
JOIN ratings ON movies.id = ratings.movie_id
WHERE movies.year = 2010
ORDER BY ratings.rating DESC, movies.title;

8.sql

SELECT people.name
FROM people
JOIN stars ON people.id = stars.person_id
JOIN movies ON movies.id = stars.movie_id
WHERE movies.title = 'Toy Story';

9.sql

SELECT people.name
FROM people
JOIN stars ON people.id = stars.person_id
JOIN movies ON movies.id = stars.movie_id
WHERE movies.year = 2004
AND people.birth IS NOT NULL
ORDER BY people.birth;

10.sql

SELECT DISTINCT people.name
FROM people
JOIN directors ON people.id = directors.person_id
JOIN movies ON movies.id = directors.movie_id
JOIN ratings ON movies.id = ratings.movie_id
WHERE ratings.rating >= 9.0;

11.sql

SELECT movies.title
FROM movies
JOIN stars ON movies.id = stars.movie_id
JOIN people ON people.id = stars.person_id
JOIN ratings ON ratings.movie_id = movies.id
WHERE people.name = 'Chadwick Boseman'
ORDER BY ratings.rating DESC
LIMIT 5;

12.sql

SELECT movies.title
FROM movies
JOIN stars AS s1 ON movies.id = s1.movie_id
JOIN people AS p1 ON p1.id = s1.person_id
JOIN stars AS s2 ON movies.id = s2.movie_id
JOIN people AS p2 ON p2.id = s2.person_id
WHERE p1.name = 'Johnny Depp' AND p2.name = 'Helena Bonham Carter';

13.sql

SELECT DISTINCT people.name
FROM people
JOIN stars ON stars.person_id = people.id
JOIN movies ON movies.id = stars.movie_id
JOIN stars AS s2 ON s2.movie_id = movies.id
JOIN people AS p2 ON p2.id = s2.person_id
WHERE p2.name = 'Kevin Bacon' AND p2.birth = 1958 AND people.name != 'Kevin Bacon';

Fiftyville

CS50のマスコットであるDuckが何者かにより盗まれてしまいました。
目撃情報、電話履歴、フライトの情報などのデータを駆使して、犯人を割り出す…という、今までは異なる趣をもった課題です。

泥棒は誰か、どの都市に逃げたのか、誰が幇助したのか。
この3点に対する回答を導き出すことが目的です。

解決までの大まかな手順は下記です。

  1. .schemaを用いて、fiftyville.dbの構造を確認する

  2. crime_scene_reportsから目撃証言を確認する

  3. 場所や日時の情報を得ることができるので、条件を絞りながら各テーブルの内容を確認する

  4. 謎が解けたら、answers.txtに回答を記入する

  5. これまで行ったことをlog.sqlに記入する

以下は実際に私が提出したファイルです。

log.sql

-- Keep a log of any SQL queries you execute as you solve the mystery.

-- Check column structure and relevance of all tables
sqlite> .schema

sqlite> SELECT description
FROM crime_scene_reports
WHERE month = 7
AND day = 28
AND street = "Humphrey Street";
description
Theft of the CS50 duck took place at 10:15am at the Humphery Street bakery.
Interviews were conducted today with three witness who were present at the time - each of their interview transcripts mentions the bakery.
Littering took place at 16:36. No known witness.

-- Retrieve all information from interviews table for July 28, 2021, that mention the bakery
-- Purpose: Gathering comprehensive details from interviews related to the theft at the Humphrey Street bakery.
SELECT *
FROM interviews
WHERE year = 2021
  AND month = 7
  AND day = 28
  AND transcript LIKE '%bakery%';
id | name | year | month | day | transcript
161 | Ruth | 2021 | 7 | 28 |
Sometime within ten minutes of the theft, I saw the thief get into a car in the bakery parking lot and drive away.
If you have security footage from the bakery parking lot, you might want to look for cars that left the parking lot in that time frame.
162 | Eugene | 2021 | 7 | 28 |
I don't know the thief's name, but it was someone I recognized.
Earlier this morning, before I arrived at Emma's bakery, I was walking by the ATM on Leggett Street and saw the thief there withdrawing some money.
163 | Raymond | 2021 | 7 | 28 |
As the thief was leaving the bakery, they called someone who talked to them for less than a minute.
In the call, I heard the thief say that they were planning to take the earliest flight out of Fiftyville tomorrow. The thief then asked the person on the other end of the phone to purchase the flight ticket.

-- Retrieve the names of individuals whose license plates were recorded in the bakery security logs on July 28, 2021, between 10:15 and 10:25
-- Purpose: Identifying the potential suspects' names based on their vehicles' license plates near the bakery during the relevant time frame.
SELECT name
FROM people
WHERE people.license_plate IN (
    SELECT license_plate
    FROM bakery_security_logs
    WHERE year = 2021 AND month = 7 AND day = 28 AND hour = 10 AND minute > 15 AND minute < 25
);
name
Vanessa
Barry
Iman
Sofia
Luca
Diana
Kelsey
Bruce

-- Retrieve the names of individuals associated with bank accounts that have withdrawal transactions on July 28, 2021, at the Fifer Street ATM
-- Purpose: Identifying individuals who made cash withdrawals around the time of the crime and obtaining their names.

SELECT name
FROM people
WHERE people.id IN (
    SELECT person_id
    FROM bank_accounts
    JOIN atm_transactions ON atm_transactions.account_number = bank_accounts.account_number
    WHERE atm_transactions.year = 2021 AND atm_transactions.month = 7 AND atm_transactions.day = 28
    AND transaction_type = "withdraw"
    AND atm_transactions.atm_location = "Leggett Street"
);

name
Kenny
Iman
Benista
Taylor
Brooke
Luca
Diana
Bruce

-- Retrieve the names of individuals who were callers involved in phone calls on July 28, 2021, with a call duration less than 60 seconds
-- Purpose: Identifying individuals who made short phone calls on the day of the crime and obtaining their names.
SELECT name
FROM people
WHERE people.phone_number IN (
    SELECT caller
    FROM phone_calls
    WHERE year = 2021 AND month = 7 AND day = 28
    AND duration < 60
);
name
Kenny
Sofia
Benista
Taylor
Diana
Bruce
Carina

-- Retrieve the names of individuals whose passport numbers match the passengers on the earliest flight of July 29, 2021
-- Purpose: Identifying individuals who were passengers on the earliest flight the day after the crime and obtaining their names.
SELECT name
FROM people
WHERE people.passport_number IN (
    SELECT passport_number
    FROM passengers
    WHERE flight_id IN (
        SELECT id
        FROM flights
        WHERE year = 2021 AND month = 7 AND day = 29
        ORDER BY hour, minute ASC
        LIMIT 1
    )
);
name
Kenny
Sofia
Taylor
Luca
Kelsey
Edward
Bruce
Doris

-- Based on the investigation, Bruce must be the thief as he matches all the criteria.

-- Get the city name where Bruce fled to
SELECT city
FROM airports
WHERE id IN (
    SELECT destination_airport_id
    FROM flights
    WHERE year = 2021 AND month = 7 AND day = 29
    ORDER BY hour, minute ASC
    LIMIT 1
);
city
New York City

-- Get the accomplice's name
SELECT name
FROM people
WHERE phone_number IN (
    SELECT receiver
    FROM phone_calls
    WHERE year = 2021 AND month = 7 AND day = 28
    AND caller = (
        SELECT phone_number
        FROM people
        WHERE name = "Bruce"
    )
    AND duration < 60
);
name
Robin

answers.txt

The THIEF is: Bruce
The city the thief ESCAPED TO: New York City
The ACCOMPLICE is: Robin

さいごに

おもしろい課題が揃っているCS50の中でも、Week7は特に粒ぞろいです。
多くの方が同じように感じているかもしれませんが、Fiftyvilleが私の一番のお気に入りです。この課題を考えた人はマジ天才だと思います。
こういう発想ができる人間に、私もなりたいです。

この記事が気に入ったらサポートをしてみませんか?