Hands-on session: Key-Value and Document Data Store (NoSQL) ################################################################################ # Part 1: Redis (Key-Value Data Store) ################################################################################ 1. Let's retrieve the Docker image with pre-installed Redis docker pull sickp/alpine-redis (we are using a lightweight container; the official one can be obtained with "docker pull redis") 2. We can now create an isolated network with a Redis server docker network create redis_network docker run --rm --network=redis_network --name=redis-server sickp/alpine-redis (to persiste the Redis storage we could use a directory outside the container, using a persistent storage: -v [localdir]:[container dir /data]) 3. In another shell we can create a Redis client, connected to the same network docker run --rm --net=redis_network -it sickp/alpine-redis redis-cli -h redis-server 4. We are ready to use Redis. We consider as use case an web application that acts as online music player, so we present the APIs considering different needs of this web application. ======================================== Basic operations ----------------- Key-Value: Strings ------- GET userId1 SET userId1 matteo GET userId1 APPEND userId1 nardelli GET userId1 EXISTS userId1 EXISTS userId2 SETNX usedId2 giovanni SETNX usedId1 giovanni DEL userId1 EXISTS userId1 EXISTS userId2 EXPIRE userId2 3 EXISTS userId2 ----------------- Key-Value: Hashes (of strings) ------- # We need to implement a reccomandation system for dynamically propose a radio station, that suggests the next song according to the history of played songs per genre. To this end, we need to keep trace of a counter for each genre played by the user. We consider to store a userXcounter for each user and, for each userXcounter, we use an hashmap that associates a counter to each genre played. HSET user1counter rock 1 HGET user1counter rock HEXISTS user1counter classic HGET user1counter classic HSET user1counter classic 1 HSET user1counter rock 4 HSET user1counter jazz 2 HSET user1counter pop 1 HEXISTS user1counter classic HDEL user1counter HEXISTS user1counter classic HKEYS user1counter HVALS user1counter ----------------- Key-Value: Sets (of strings) ------- # We also need to retrieve bands or singers that play that musical genre. We can rely on the data store to memorize bands/singers per each musical genre. We assume that a band can play several genres. We might be interested in selecting bands belonging to multiple genres, or in identifying a selection of bands that play the same kind of music. SADD rock "pink floyd" SADD rock "queen" SADD rock "springsteen" SADD rock "baustelle" SADD jazz "paolo conte" SADD pop "paolo conte" SADD pop "baustelle" SCARD rock SCARD Rock SADD pop "mozart" SREM pop "mozart" SDIFF rock pop SUNION rock jazz SINTER rock pop ----------------- Key-Value: Sorted Sets (of strings) ------- # The recommendation system might learn from the user behavior upon the suggested songs. Therefore, we need to identify the number of reproduction of the suggested genre, so that, in the future, we can suggest the top-K genres that have been suggested and listened by the user. We can use the sorted sets to store the number of reproduction of songs per genre, so that the data structure can automatically determines the top-K elements. ZCARD urepr ZADD urepr 1 rock ZADD urepr 1 jazz ZADD urepr 1 pop ZCARD urepr ZREM urepr pop ZCARD urepr ZRANK urepr pop ZRANK urepr rock ZINCRBY urepr 3 rock ZINCRBY urepr 1 pop ZRANK urepr pop ZRANK urepr rock ZRANGE urepr 0 1 ZRANGE urepr 0 -1 ZRANGE urepr 0 -2 ZRANGEBYSCORE urepr 3 10 ----------------- Key-Value: Lists (of strings) ------- # The only music player needs to store the playlist for the user, which can be populated by the user or by the recommendation system. RPUSH uplay "time" RPUSH uplay "money" LPUSH uplay "glory days" LLEN uplay LRANGE uplay 0 -1 LRANGE uplay -2 -1 ??? LPOP uplay LLEN uplay LREM uplay "glory days" 0 # overrides data or out of range: LSET uplay 1 "we will rock you" LRANGE uplay 0 -1 ################################################################################ # Part 2: MongoDB (Document Data Store) ################################################################################ 1. Let's retrieve the Docker image with pre-installed MongoDB (we use the official container) docker pull mongo 2. We can now create an isolated network with a MongoDB server docker network create mongonet docker run -i -t -p 27017:27017 --name mongo_server --network=mongonet mongo:latest /usr/bin/mongod --bind_ip_all 3. In another shell we can create a Mongo client, connected to the same network docker run -i -t --name mongo_cli --network=mongonet mongo:latest /bin/bash 4. On the client, we open the command line interface "mongo", specifying the address of the server mongosh mongo_server:27017 5. We are ready to use MongoDB. We consider as use case an content management system, which needs to manage posts, images, videos, and so on, each with its own specific attributes; for example: { name : "hello world", type : "post", size : 250, comments : ["c1", "c2"] } { name : "sunny day", type : "image", size : 3, url : "abc" } { name : "tutorial", type : "video", length : 125, path : "/video.flv", metadata : {quality : "480p", color : "b/n", private : false } } ======================================== Basic operations ----------------- # Create and switch to a new database named "cms" use cms # Insert in a collection named "cms" several documents (a post, an image, and a video) db.cms.insert({ name : "hello world", type : "post", size : 250, comments : ["c1", "c2"] } ) db.cms.insert({ name : "sunny day", type : "image", size : 300, url : "abc" }) db.cms.insert({ name : "tutorial", type : "video", length : 125, path : "/video.flv", metadata : {quality : "480p", color : "b/n", private : false } }) # Querying the database to find documents db.cms.find() db.cms.find({type : "image" } ) db.cms.insert({ name : "mycms-logo", type : "image", size : 3, url : "cms-logo.jpg" }) db.cms.find({type : "image" } ) db.cms.find( {size : { $gt : 100 } } ) db.cms.find( {size : { $lt : 100 } } ) db.cms.find({ length : { $exists: true } } ) db.cms.find({ comments : { $exists: true } } ) db.cms.find({ "comments.2" : { $exists: true } } ) db.cms.find({ "comments.1" : { $exists: true } } ) db.cms.find({ "metadata.quality" : { $exists: true } } ) db.cms.find({ "metadata.quality" : "360p" } ) db.cms.find({ "metadata.quality" : "480p" } ) db.cms.find({ "metadata.private" : false } ) db.cms.findOne({ comments : { $exists: true } } ) db.cms.findOne({ comments : { $exists: true } } ).comments[1] # A slightly more complex query, that combines multiple conditions in a logical AND or OR condition: AND: db.cms.find( {size : { $gt : 100 } , type : "image" } ) OR: db.cms.find( { $or : [ { size : { $gt : 100 } } , { type : "image" } ] } ) # Sort results in ascending (value = 1) or descending order (value = -1) db.cms.find().sort( { name : 1 } ) db.cms.find({type : "image" }).sort( { name : -1 } ) db.cms.find().sort( { nonexistingfield : 1 } ) # Update a document: there are several way to update a document; for example we can simply update a subset of fields using the operator $set: db.cms.update({ "name" : "Canon EOS 750D" }, { $set: { "address.street": "East 31st Street" } } ) # Update a document: the previous command changes only the first occurrence; if the update should be performed on multiple occurrences, the parameter "multi" should be used: db.cms.update({ "name" : "mycms-logo" }, { $set: { "metadata.quality": "hd" } } ) db.cms.find({name : "mycms-logo"}) db.cms.find({type : "image"}) db.cms.update({ type : "image" }, { $set: { "metadata.author": "myname" } } , {multi: true} ) db.cms.find({type : "image"}) # Update a document: how to add a new element to an array (using the operator $push): db.cms.find({ type : "post" } ) db.cms.update({ name : "hello world" }, { $push: { "comments": "a new comment" } } ) db.cms.update({ name : "hello world" }, { $push: { "comments": "a second new comment" } } ) db.cms.find({ type : "post" } ) # Update a document: override the whole document db.cms.find() db.cms.update({ name : "mycms-logo" }, { author : "myname" } ) db.cms.find() # Remove documents. By default .remove() removes all the documents that match the document fields passed as argument. If a single delete should be performed, we need to use the parameter "justOne". db.cms.remove( { type : "video" } ) db.cms.remove( { author : "myname" } , { justOne: true } ) # the following removes all documents db.cms.remove( { } ) # We can also drop the whole collection (together with its documents) db.cms.drop()