Snuze

Persistence: Storing and Retrieving Reddit Data

This is a draft. The information here is accurate, but not yet complete. Estimated completion for this document is Friday, 2019-09-20.

For now, some rough but working MySQL setup steps are below...

Overview

Snuze supports optionally persisting Reddit data through the concept of storage providers. A storage provider is a set of PHP classes that tell Snuze how to interact with some data storage facility. Two storage providers are built in:

The SQLite storage provider

A very basic SQLite storage provider is included with Snuze. It requires that the pdo_sqlite extension is available; this is enabled by default in modern PHP installations.

All the SQLite storage provider knows how to do is save OAuth access tokens so they can be re-used until they expire. This reduces the number of login requests your script sends to Reddit's server, making it more polite (and a bit more efficient). While lacking in features, the advantage to the SQLite provider is that it requires no database server and no setup. Access tokens are stored in a small data file.

SQLite storage is enabled by default in Snuze. To enable it explicitly, or to change the path of the SQLite data file, use these options when configuring:

       'storage.enabled'    => true,
       'storage.namespace'  => '\snuze\Persistence\SQLite',
       'storage.parameters' => ['filename' => 'snuze.sqlite'],

To disable persistence altogether, set storage.enabled to false instead:

       'storage.enabled'    => false,

Be advised that completely disabling persistence means your script will have to go and obtain a new access token every time it runs.

The MySQL storage provider

Overview

Snuze comes with a MySQL storage provider, capable of persisting most entities it can retrieve from the API.

The Snuze MySQL schema doesn't define foreign keys or enforce referential integrity. This is an intentional design decision. Some users may want to spider and save links only, for example, without being forced to fetch and store the subreddits and submitters' accounts as well.

Setup

Rough notes. Use these steps to set up the MySQL storage provider.

  1. Create a database and a user for Snuze, and give the user full access to the database.
[user@host ~/mybot]$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 44861
Server version: 5.7.27-log MySQL Community Server (GPL)

mysql> create database snuze;
Query OK, 1 row affected (0.01 sec)

mysql> grant all privileges on snuze.* to 'snuzer'@'localhost' identified by 'blah';
Query OK, 0 rows affected, 1 warning (0.01 sec)
  1. From the directory where you composer require'd Snuze, run the bundled MySQL database utility. This will install the Snuze MySQL database schema.
[user@host ~/mybot]$ php vendor/snuze/snuze/src/Persistence/MySQL/utility/utility.php

This is the Snuze MySQL storage provider database utility. Please follow along
with the prompts to upgrade or install your Snuze MySQL database.

Enter MySQL hostname or IP address: localhost
Enter MySQL username: snuzer
Enter MySQL password: blah
Enter MySQL database name (must exist already): snuze

Connecting to MySQL...
No existing schema was found. The tables need to be created.

Apply these changes now? [yes/no]: yes
Schema installation is complete!
  1. Tweak your Snuze configuration array so it uses the MySQL storage provider namespace. For the storage parameters, supply an array containing the MySQL credentials:
<?php
require_once __DIR__ . '/vendor/autoload.php';

/* Set authentication and configuration parameters */
$config = [
   'auth.client_id'     => 'your-client-id-from-reddit',
   'auth.client_secret' => 'your-secret-from-reddit',
   'auth.username'      => 'your-reddit-username',
   'auth.password'      => 'your-reddit-password',
   'auth.user_agent'    => 'php:YourAwesomeBot 1.2.3 (by /u/your-reddit-username)',
   /* This turns on MySQL support! */
   'storage.namespace'  => '\snuze\Persistence\MySQL',
   'storage.parameters' => [
       'server'   => 'localhost',
       'username' => 'snuzer',
       'password' => 'blah',
       'database' => 'snuze'
   ],
];

/* Pass the config array to a SnuzeFactory to get a Snuze object */
$snuze = (new \snuze\SnuzeFactory($config))->getSnuze();

Now you can start saving the data you retrieve from the API:

$sub = $snuze->fetchSubreddit('memphis');
$snuze->getSubredditMapper()->persist($sub);

foreach ($snuze->fetchLinksRising('memphis', 10) as $link) {
   $snuze->getLinkMapper()->persist($link);
}

Check MySQL and you should have information about /r/memphis in the subreddits table, and the top 10 "rising" links from it in your links table:

[user@host ~/mybot]$ mysql -u snuzer -p snuze
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 44861
Server version: 5.7.27-log MySQL Community Server (GPL)

mysql> select id, from_unixtime(created), subscribers, public_description from subreddits;
+-------+------------------------+-------------+--------------------------------------+
| id    | from_unixtime(created) | subscribers | public_description                   |
+-------+------------------------+-------------+--------------------------------------+
| 2qpqw | 2008-12-11 02:55:12    |       14181 | The Reddit community of Memphis, TN. |
+-------+------------------------+-------------+--------------------------------------+
1 row in set (0.00 sec)

mysql> select from_unixtime(created) posted, author, title from links order by created desc;
+---------------------+---------------------+-----------------------------------------------------------------------------------------------+
| posted              | author              | title                                                                                         |
+---------------------+---------------------+-----------------------------------------------------------------------------------------------+
| 2019-08-21 07:07:55 | Boneyabba           | neighborhood watches                                                                          |
| 2019-08-21 03:57:14 | Imallvol7           | One Beale breaking ground downtown Tuesday | WREG.com                                         |
| 2019-08-21 02:57:05 | BigChree2407        | Crosstown EV Charging                                                                         |
| 2019-08-21 02:00:52 | GritsandGrinds901   | This Giraffe clearly remembers the time I dropped $5 twelve times to feed it.                 |
| 2019-08-21 01:20:12 | BandidoCoyote       | RIP Mall of Memphis                                                                           |
| 2019-08-20 23:28:29 | RaceMIR             | Ground Pounding Pro Mods Invade Memphis International Raceway September 13-14                 |
| 2019-08-20 21:49:47 | 24fps_              | Kroger removing free publications?                                                            |
| 2019-08-20 20:01:19 | im-just-here-       | Best coffee shop in Memphis?                                                                  |
| 2019-08-20 19:37:01 | NonconnahChupacabra | FBI stops shooter planning on attacking a Memphis church                                      |
| 2019-08-20 19:01:39 | HellooNewmann       | 7th grade boy donates $15K from County Fair winnings to St. Jude Childrenâ–’s Research Hospital |
+---------------------+---------------------+-----------------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)