Doing Something Useful With My IMDb Data Set

13 Nov 2019


Background

Recently, I took a set of IMDb data, re-arranged it, and loaded it into a relational schema.  You can read about that here: Using IMDb as a Test Data Set.

Interesting, perhaps. But not especially useful.

So I created a web site to display a part of that data:

  • Titles - movies, TV shows, etc.
  • Talent - actors, directors, producers, etc.
  • Credits - just the top few actors, and other main crew details. 
  • Filmography - the movies (and TV shows) for a given talent.

The web site itself uses various technologies I wanted to take a closer look at, such as:

  • Javalin web framework.
  • Embedded Jetty web server.
  • Thymeleaf templating engine.
  • Jdbi relational data access.
  • Hikari database connection pooling.
  • Hibernate Validator data validation.
  • Gson JSON serialization/deserialization.
  • jQuery UI user interface widgets.
  • Datatables tabular data display.
  • Skeleton responsive UI/layout boilerplate.

Further down the road there are also questions about how to handle larger volumes of data.  The full IMDb data set has over 6 million titles.  I would not attempt to display even a small fraction of that volume in a list.  We will return to this subject later.
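One common way to keep a large listing manageable is to fetch it a page at a time. The following is only a minimal sketch of the arithmetic involved, not code from the demo application: the class and method names are hypothetical, and it assumes the database supports `LIMIT`/`OFFSET`-style paging (which H2 does).

```java
// Hypothetical helper for illustration only; it is not part of the demo.
final class PagingSketch {

    // Rows to skip for a 1-based page number, e.g. for "LIMIT ? OFFSET ?".
    static long offsetFor(long pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    // Number of pages needed to show totalRows rows (ceiling division).
    static long pageCount(long totalRows, int pageSize) {
        if (totalRows < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalRows must be >= 0 and pageSize >= 1");
        }
        return (totalRows + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 25;            // assumed page size
        long totalRows = 6_000_000L;  // rough title count mentioned above
        System.out.println("Pages needed: " + pageCount(totalRows, pageSize));
        System.out.println("Offset for page 3: " + offsetFor(3, pageSize));
    }
}
```

With values like these, each request touches only one page of rows, however large the underlying table is.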

I will take a look at the above technologies in some follow-on posts.  For now, I want to just present the web site itself, and walk through how you can set it up for yourself, if you want to.

Screenshots

Here are some screenshots. 

Title listing page:

Title record maintenance page:

Talent in Title (cast and crew) page:

There’s not much more to the user interface than that - but it’s sufficient to investigate several technologies in a reasonably realistic way.

Set-Up

To run the web site, take the following steps:

Prerequisites

You need a recent version of Java.  The application is built using Maven - so you need that also. Installing those is outside the scope of this article - but it should be straightforward and is well documented elsewhere.

Download the code

Download a zip file of the repository from GitHub:

https://github.com/northcoder-repo/showcase-basic-web/archive/master.zip

Unzip the file, and (optionally) rename the resulting directory to TitleWebDemo.

(Or you can choose to clone the repository, if you prefer.)

Start the Database

The demo comes bundled with a small H2 database.

At the command line, move into the h2 directory.

There are versions of the scripts for Windows (.bat) and Linux (.sh).  You only need to run the relevant 01_start_h2_server script.

On Linux you may need to make the script executable before you can run it:

chmod u+x 01_start_h2_server.sh

There are other scripts in the h2 directory which show how the H2 server was created and how the database was populated with data.  You can take a look at these, but you don’t need to run them unless you want to make changes.  The database has already been created.

If the database starts correctly, you will see output like this:

$> ./01_start_h2_server.sh  

TCP server running at tcp://172.31.22.154:9092 (only local connections)  
PG server running at pg://172.31.22.154:5435 (only local connections)  
Web Console server running at http://172.31.22.154:8082 (only local connections)  

Failed to start a browser to open the URL http://172.31.22.154:8082: Browser detection failed, and java property 'h2.browser' and environment variable BROWSER are not set to a browser executable.

In the above case, the browser error occurred because I ran the command on a Linux server without a GUI installed. This can be ignored.
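If you would rather confirm from code that the database is accepting connections, a simple probe of the TCP port is enough. This is a minimal sketch, not part of the demo: the class name `PortCheck` is hypothetical, and it assumes the H2 TCP server is on port 9092 (as in the output above) and reachable via localhost on your machine.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration only; it is not part of the demo.
final class PortCheck {

    // Returns true if something accepts a TCP connection on host:port
    // within the given timeout, false otherwise.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 9092 is the TCP port reported in the start-up output above.
        boolean up = isListening("localhost", 9092, 1000);
        System.out.println(up ? "H2 TCP port is open" : "H2 TCP port is not reachable");
    }
}
```

An open port only shows that a server is listening there. It does not prove that the demo database itself is available.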

When you are ready to stop the database, type Ctrl-C.  

Build the Web Application

The git repository does not include the application JAR file, so we have to build it now.

Open a new command window and go to your TitleWebDemo directory.  Run the 01_clean_and_build script (as above, you may need to make it executable first).

Upon successful completion, you should see output similar to this:

[INFO] ------------------------------------------------------------------------  
[INFO] BUILD SUCCESS  
[INFO] ------------------------------------------------------------------------  
[INFO] Total time: 18.538 s  
[INFO] Finished at: 2019-11-13  
[INFO] Final Memory: 28M/100M  
[INFO] ------------------------------------------------------------------------

You will also see the new TitleWebDemo.jar file in the target directory.

Start the Web Application

Run the 02_run_titlewebdemo script. This starts the Javalin web server.  You should see output similar to this:

2019-11-13 [INFO ] [main] Javalin -  
           __                      __ _  
          / /____ _ _   __ ____ _ / /(_)____  
     __  / // __ `/| | / // __ `// // // __ \  
    / /_/ // /_/ / | |/ // /_/ // // // / / /  
    \____/ \__,_/  |___/ \__,_//_//_//_/ /_/  

        https://javalin.io/documentation  

2019-11-13 [INFO ] [main] Javalin - Starting Javalin ...  
2019-11-13 [INFO ] [main] Javalin - Listening on http://localhost:7080/  
2019-11-13 [INFO ] [main] Javalin - Listening on https://localhost:7443/  
2019-11-13 [INFO ] [main] Javalin - Javalin started in 1094ms \o/  
2019-11-13 [INFO ] [main] HikariDataSource - HikariPool-1 - Starting...  
2019-11-13 [INFO ] [main] HikariDataSource - HikariPool-1 - Start completed.

Access the Web Application

You can now open the web application in a browser (use https not http):

https://localhost:7443/titles

WARNING: Because the application uses a self-signed certificate, your browser will probably show a security warning before it lets you continue to the site.  It’s OK (in this case) to continue.

You should see the title listing, as shown in the screenshot earlier in this article.

When You Have Finished

Remember to shut down the web application and the database at the end.  In both cases, go to the windows where you started those programs and type Ctrl-C.

Wrap-Up

Limitations

The demo web site is not a fully functioning CRUD application.  For example:

  • You can update records but you cannot add or delete records.
  • There is no login mechanism - and, consequently, no access control logic.

What's Next?

A series of posts describing various aspects of the web site technology in more detail, including: