Apprentices' Guild's Journal
Written by Hansibaba on Tue, Oct 3, 2023 4:04:26 PM and viewed 584 times
PHP Certificate - Initialization and Batch Insertion [Days 1 & 2]
Hey y'all, first blog post for my PHP Application Specialist certificate. Developed the concepts of Initialization and batch insertion so far. Initialization in this context refers to the creation of the database and table with the table having the pertinent columns included; batch insertion refers to the insertion of multiple rows of data -a batch of data- all-at-once using a single statement. Worked hard on these for about 2 days in full, hope y'all enjoy the blog.
Day 1
First day just got the basic initialization functioning. I'm updating this after-the-fact so the code provided is the debugged and functioning version of it developed after the second day of development. For this certificate I needed to create a database with over half a million entries. For each entry I wanted to give a little more than just an ID with a number so we have the amazingly intimate details of a random day of the week, miles walked, food, and a number from 0 to 100 (personalized to "favorite number"); glad these ID's can be so opinionated. The code for initializing this structure is quite simple although it's made significantly larger due to the implemented security measures against SQL injection.
<?php
//// INITIALIZE DATABASE AND TABLE : ESTABLISH CONNECTION TO DATABASE //
// CREATE VARIABLES
$servername = "localhost";
$username = "hansibaba";
$password = "test";
$database = "php_db";
$table = "php_table";
$favoritefoodenum = "ENUM('Pizza','IceCream','Sushi','Hamburger','Sandwhich','Apple','Pasta','Burrito')";
$favoritedayenum = "ENUM('Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday')";
$favoritefoodarray = ['Pizza','IceCream','Sushi','Hamburger','Sandwhich','Apple','Pasta','Burrito'];
$favoritedayarray = ['Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'];
// START SESSION TO PROVIDE VARIABLES TO OTHER SCRIPTS
session_start();
$_SESSION['favoritefoodarray'] = $favoritefoodarray;
$_SESSION['favoritedayarray'] = $favoritedayarray;
// INITIALIZE DATABASE
$serverconn = new mysqli($servername,$username,$password);
$sql = "CREATE DATABASE IF NOT EXISTS " . $database;
mysqli_query($serverconn,$sql);
mysqli_close($serverconn);
// INITIALIZE TABLE
$conn = new mysqli($servername,$username,$password,$database);
$_SESSION['conn'] = $conn;
$sql = "CREATE TABLE IF NOT EXISTS " . $table . " (
ID INT PRIMARY KEY,
MILESWALKED DECIMAL,
FAVORITEFOOD " . $favoritefoodenum . ",
FAVORITEDAY " . $favoritedayenum . ",
FAVORITENUMBER INT
)";
mysqli_query($conn,$sql); //create table
?>
This initialization code first creates the variables needed for the database and defines its name and other wonderfully boring details. It then uses this information to first create a connection to phpmyadmin (the database server which utilizes mariadb), then to create a database on this server. From here it creates a connection that includes a connection to the new-born database in order to create a table with our information. Minus all of the computer non-sense, this connects to the place with the numbers then makes a place for more numbers and words.
When I first started this project, I had (less informedly) attempted to do all of this inverted; I attempted to create PHP code with HTML when, in fact, this is the exact opposite of how these two are supposed to be used together. Here is that lovely attempt in all of its glory:
<?php
class SQLHandler
{
private $hostname = 'localhost';
private $username = 'hansibaba';
private $password = 'test';
public $database = 'php_db';
private $table = 'php_table';
public $connection;
public function __construct()
{
$this->connect_sql();
}
public function search_by($input,$column)
{
$input = mysqli_real_escape_string($this->connection,$input);
$query = 'MILESWALKED,FAVORITEFOOD,FAVORITEDAYOFTHEWEEK,FAVORITENUMBER FROM' . $this->database . '.' . $this->table . 'WHERE ' . $column . ' = ?';
$statement = $this->prepare_query($query);
mysqli_stmt_bind_param($statement,'s',$input);
mysqli_stmt_execute($statement);
$result = mysqli_stmt_get_result($statement);
mysqli_stmt_close($statement);
$rows = [];
while($row == mysqli_fetch_assoc($result))
{
$rows[] = $row;
}
return $rows;
}
//Create MYSQL connection
public function connect_sql()
{
$serverConn = mysqli_connect($this->hostname, $this->username, $this->password);
mysqli_query($serverConn,"
CREATE DATABASE IF NOT EXISTS php_db;
");
$this->connection = mysqli_connect($this->hostname, $this->username, $this->password,$this->database);
if(!$this->connection)
{
die('Connection Failed : ' . mysqli_connect_error());
}
//echo "Connection Established <br>";
}
//Close MYSQL connection
public function close_sql()
{
mysqli_close($this->connection);
}
//Execute a provided query via connection
public function execute_query($query)
{
$result = mysqli_query($this->connection,$query);
return $result;
}
public function prepare_query($connection,$query)
{
$result = mysqli_prepare($this->connection,$query);
return $result;
}
//Start up querries for initializing the database
public function build_database()
{
//Create table
$this->execute_query("
CREATE TABLE IF NOT EXISTS " . $this->database . "." . $this->table ."(
ID INT PRIMARY KEY AUTO_INCREMENT,
MILESWALKED DECIMAL,
FAVORITEFOOD ENUM('Pizza','Hamburger','Grapes','Sushi','Apples','Burritos','Tacos','IceCream'),
FAVORITEDAYOFTHEWEEK ENUM('Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'),
FAVORITENUMBER INT
);
");
$this->execute_query("
CREATE INDEX IF NOT EXISTS idx_favorite_food ON " . $this->database . "." . $this->table ."(FAVORITEFOOD);
");
$this->execute_query("
CREATE INDEX IF NOT EXISTS idx_favorite_day ON " . $this->database . "." . $this->table ."(FAVORITEDAYOFTHEWEEK);
");
//echo "Database Constructed <br>";
}
public function return_number_of_row_in_db()
{
$NumOfRowsInDatabase = 0;
$RawRowData = $this->execute_query("SELECT COUNT(ID) AS rowcount FROM " . $this->database . "." . $this->table . ";");
if ($RawRowData != null)
{
$assoc_array = mysqli_fetch_assoc($RawRowData);
$NumOfRowsInDatabase = (int)$assoc_array['rowcount'];
}
return $NumOfRowsInDatabase;
}
//Populate tables with entries
public function populate_table($MaxRows)
{
$NumOfRowsInDatabase = $this->return_number_of_row_in_db();
if ($NumOfRowsInDatabase < $MaxRows)
{
for ($i=$NumOfRowsInDatabase; $i < $MaxRows; $i++)
{
$mileswalked = rand(0,1000);
$favoritefood = array('Pizza', 'Hamburger', 'Grapes', 'Sushi', 'Apples', 'Burritos', 'Tacos', 'IceCream');
$favoritedayoftheweek = array('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday');
$favoritenumber = rand(0,100);
$statement = $this->prepare_query
($this->connection,"
INSERT INTO " . $database ."." . $table . "
(
MILESWALKED,
FAVORITEFOOD,
FAVORITEDAYOFTHEWEEK,
FAVORITENUMBER
)VALUES(
?,
?,
?,
?
);
");
mysqli_stmt_bind_param($statement,"dssi",$mileswalked,$favoritefood[array_rand($favoritefood)],$favoritedayoftheweek[array_rand($favoritedayoftheweek)],$favoritenumber);
mysqli_stmt_execute($statement);
mysqli_stmt_close($statement);
}
}
}
}
class PageHandler
{
public $sqlHandler;
private $connection;
private $entries;
//SETUP WEBPAGE
public function __construct($sqlHandler)
{
$rows = $sqlHandler->return_number_of_row_in_db();
$this->sqlHandler = $sqlHandler;
$this->connection = $sqlHandler->connection;
echo
"
<!DOCTYPE html>
<html>
<head>
<title>PHP DATABASE </title>
</head>
<body>
<h1>PHP DATABASE</h1>
<h3>Number of Entries in Database : " . $rows . "</h3>
<form method='POST'>
<label for='searchtype'>Search by </label>
<select id='searchtype' name='searchtype' style='margin-bottom:10px'>
<option value = '0'>ID</option>
<option value = '1'>food</option>
<option value = '2'>day</option>
<option value = '3'>miles walked</option>
<option value = '4'>number</option>
</select>
<input type='submit' name='submit' value='searchtype'/>
</form>
<br>
" . $this->display_info() . "
</body>
</html>
";
}
private function display_info()
{
$selectedtype = 0;
if($_SERVER['REQUEST_METHOD'] == "POST")
{
if(isset($_POST['submit']))
{
if(isset($_POST["searchtype"]))
{
$selectedtype = $_POST["searchtype"];
}else{
echo "search criteria not detected";
}
switch($selectedtype)
{
case 0:
$this->display_rows_by_id();
break;
case 1:
break;
case 2:
break;
case 3:
break;
case 4:
break;
}
}
}
}
private function return_bounds()
{
if($_SERVER["REQUEST_METHOD"] == "POST")
{
try
{
$rowsToShow = $_POST["rowsToShow"];
$stringArray = preg_split("/\D+/",$rowsToShow);
$upperBound = (int)$stringArray[1];
$lowerBound = (int)$stringArray[0];
//FLIP UPPER AND LOWER BOUND IF UPPER BOUND IS LESS THAN LOWER BOUND
if ($upperBound < $lowerBound)
{
[$upperBound,$lowerBound] = [$lowerBound,$upperBound];
}
$bounds = array($lowerBound,$upperBound);
return $bounds;
} catch (Exception $e) {
echo "Error in retrieving form information. Error : " . $e->getMessage();
}
}
}
private function return_rows_within_range($lowerbound,$upperbound)
{
$query = "SELECT MILESWALKED,FAVORITEFOOD,FAVORITEDAYOFTHEWEEK,FAVORITENUMBER FROM " . $database . "." . $table . " WHERE ID BETWEEN ? AND ?";
$statement = $this->sqlHandler->prepare_query($this->connection,$query);
mysqli_stmt_bind_param($statement,"ii",$lowerbound,$upperbound);
mysqli_stmt_execute($statement);
$result = mysqli_stmt_get_result($statement);
mysqli_stmt_close($statement);
$rows = [];
while($row = mysqli_fetch_assoc($result))
{
$rows[] = $row;
}
return $rows;
}
private function display_rows_by_id()
{
$db = $this->sqlHandler->database;
echo "
<table border = '10'>
<tr>
<th>ID</th>
<th>Miles Walked</th>
<th>Favorite Food</th>
<th>Favorite Day of the Week</th>
<th>Favorite Number</th>
</tr>
";
}
private function display_selected_rows()
{
// DISPLAY HEADING //////////////////////////////////////////////////////////////
$bounds = $this->return_bounds();
if($bounds !== null)
{
$lowerBound = $bounds[0];
$upperBound = $bounds[1];
echo "From $lowerBound to $upperBound";
} else {
die("No rows selected");
}
// DISPLAY ROWS //////////////////////////////////////////////////////////////
$rows = $this->return_rows_within_range($lowerBound,$upperBound);
if(!empty($rows))
{
$columnNames = array_keys($rows[0]);
echo "<br>";
foreach($columnNames as $key)
{
echo $key . ' | ';
}
foreach($rows as $row)
{
echo "<br>" . $row["ID"] . ' | ' . $row["MILESWALKED"] . ' | ' . $row["FAVORITEFOOD"] . ' | ' . $row["FAVORITEDAYOFTHEWEEK"] . ' | ' . $row["FAVORITENUMBER"] . ' | ';
}
}
}
}
//CREATE AND POPULATE DATABASE UP TO SPECIFIED NUMBER OF ENTRIES
$entries = 510000;
$sqlHandler = new SQLHandler;
$sqlHandler->build_database();
$sqlHandler->populate_table($entries);
//CREATE DYNAMIC WEBPAGE
$pagehandler = new PageHandler($sqlHandler);
?>
If you read all of that (and presumably attempted to understand), I'm sorry. The reason for this format is now only between the version of me that was making it at the time, and god. This code *did* work although certainly as an affront to programmers everywhere from now and throughout the future. This code does what the newer code of day 1 does and actually has methods to present and format this data. The first portion initializes the database and table through a similar methodology as the updated code; the later portions then setup the HTML (yes as one very, very large string) that actually writes each row of data individually.
The code has two classes, the SQL handler and the Page handler. The SQL handler (as a surprise to no-one) handles the SQL; it creates connections and has methods to update the database as well as to handle queries. The Page handler on the other hand(ler), handles the rendering of the page; it has all of the *lovely* HTML "code" and also makes some cool buttons for everyone's clicking necessities. This also allows the user to search through the database for specific row-to-row entries and will format an entry appropriately if (for whatever reason) the person inputting did not perfectly format their range of rows they wanted to view.
Day 2
Few days since the last day of development due to this not being the only project I am currently working on; sadly, I am not actually a crazed database-obsessed coder. For that other project, just visit the L.I.C.H. development blogs to get more of my *amazing*, *captivating*, *awe-inspiring*, code. Alright, enough self-promotion and idolization of clicking a keyboard till a machine does what I want. Day 2 got into two main things, setting up the UI and getting the population of the table to be efficient (not take too much time) and effective (ya know... actually function); gotta have it be efficient as to not take time away my blog writing of course and I suppose to also not take up too many system resources and all that jazz. With this updated version of the code, I decided to split it into several scripts as to make the code more organized and generally easier to read so I can stop everyone's eyes from bursting into flames from the mere sight of it.
<?php
//// INITIALIZE DATABASE AND TABLE : ESTABLISH CONNECTION TO DATABASE //
// CREATE VARIABLES
$servername = "localhost";
$username = "hansibaba";
$password = "test";
$database = "php_db";
$table = "php_table";
$favoritefoodenum = "ENUM('Pizza','IceCream','Sushi','Hamburger','Sandwhich','Apple','Pasta','Burrito')";
$favoritedayenum = "ENUM('Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday')";
$favoritefoodarray = ['Pizza','IceCream','Sushi','Hamburger','Sandwhich','Apple','Pasta','Burrito'];
$favoritedayarray = ['Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'];
// START SESSION TO PROVIDE VARIABLES TO OTHER SCRIPTS
session_start();
$_SESSION['favoritefoodarray'] = $favoritefoodarray;
$_SESSION['favoritedayarray'] = $favoritedayarray;
// INITIALIZE DATABASE
$serverconn = new mysqli($servername,$username,$password);
$sql = "CREATE DATABASE IF NOT EXISTS " . $database;
mysqli_query($serverconn,$sql);
mysqli_close($serverconn);
// INITIALIZE TABLE
$conn = new mysqli($servername,$username,$password,$database);
$_SESSION['conn'] = $conn;
$sql = "CREATE TABLE IF NOT EXISTS " . $table . " (
ID INT PRIMARY KEY,
MILESWALKED DECIMAL,
FAVORITEFOOD " . $favoritefoodenum . ",
FAVORITEDAY " . $favoritedayenum . ",
FAVORITENUMBER INT
)";
mysqli_query($conn,$sql); //create table
?>
This first script (which I already talked about in day 1) is named PHPInitialize.php (charming name, I know). As said in day one, this handles the initial table and database creation. It also creates a session in order to provide the created variables to other scripts.
<?php
require_once('PHPInitialize.php');
function execute_sql($query,$types = NULL,...$params)
{
$conn = $_SESSION['conn'];
$stmt = mysqli_prepare($conn,$query);
if (!$stmt) {
die('Error in preparing statement: ' . mysqli_error($conn));
}
if($types !== NULL)
{
$bindparams = array_merge([$stmt,$types],$params);
$references = [];
foreach($bindparams as $key => $value)
{
if($key > 1) //cannot be 1 or 0 because the statment and types are stored there
{
$references[] = &$bindparams[$key];
}
}
call_user_func_array('mysqli_stmt_bind_param', $bindparams);
}
if(!mysqli_stmt_execute($stmt))
{
throw new Exception("Error executing query: " . mysqli_error($conn));
}
$result = mysqli_stmt_get_result($stmt);
if($result !== false)
{
return $result;
} else {
if(mysqli_affected_rows($conn) > 0)
{
return true;
}
else
{
throw new Exception("Error fetching result set:" . mysqli_error($conn));
}
}
mysqli_stmt_close($stmt);
}
function return_num_of_rows()
{
$result = execute_sql("SELECT COUNT(ID) FROM php_db.php_table");
$row = mysqli_fetch_row($result);
return (int)$row[0];
}
function randomize_entries()
{
$mileswalked = rand(0,1000);
$favoritefood = $_SESSION['favoritefoodarray'][array_rand($_SESSION['favoritefoodarray'])];
$favoritedayoftheweek = $_SESSION['favoritedayarray'][array_rand($_SESSION['favoritedayarray'])];
$favoritenumber = rand(0,100);
$_SESSION['miles'] = $mileswalked;
$_SESSION['food'] = $favoritefood;
$_SESSION['day'] = $favoritedayoftheweek;
$_SESSION['number'] = $favoritenumber;
}
?>
This beautiful lump of words and symbols is the second script which is named PHPFunctions.php (again, very clever). This script sets up several useful functions for the next script -which actually uses them. The first function, execute_sql, does what no-one thought it could do... executes sql. This particular function is less for convenience as it is for security -although it's certainly convenient. It grabs the connection from the initialization script and prepares it as a statement; this preparation ties the query to a connection and "cleans" it, making it resistant to sql injection. After a bit of error handling, it then, if provided with types -and by extension parameters-, "binds" the parameters to the statement. This binding process starts with making an array with the statement and types at the start (for a later surprise) and the parameters right on the end. It then uses a callback to finally bind all of those incredibly intimate details to the statement using the mysqli_stmt_bind_param statement. After that action-packed roller coaster, it executes the newly scrubbed-clean sql query, returns the result if there is one, and close the statement to free up machine resources.
Because I know some of you were thinking, with the up-most vigor, "But what about the other functions!?" (yes, with the ques-clamation mark), here is how the fan-favorites, return_num_of_rows and randomize_entries work. For the former, it first counts the number of entries in the ID column of the database (every entry as an ID as it's the primary key for the table) and then fetches that data as an array. From there, it casts(think of a blacksmith melting something down and casting it into a mold of something else) the data to an integer and returns that value. Voilà, we've achieved a feat un-touched by man... counting. For the latter of the two functions, it first randomizes the several data points that we will enter into the next script and then assigns those, now random, points to the session variables so that they can be accessed later. As for the variables for favorite food and days, they pull the array of possible options for those entries and picks a random possibility to assign to the session variable.
Now, for the moment you've all been waiting for... the last script! This script is likely the most complex of the 3 but I'll go through it since everyone's dying to know how it works. This is where the batch population part of this blog comes in. This script is the one that does the grand magic of database population.
<?php
require('PHPFunctions.php');
// DEFINE VARIABLES ; get number of rows, max number of desired rows, batch size, and number of batches needed to achieve the desired size
$rows_in_db = return_num_of_rows();
$max_rows = 5000000;
$batch_size = 400;
$rows_in_db = return_num_of_rows();
$rows_left = $max_rows - $rows_in_db;
$total_iterations = ceil($rows_left/$batch_size);
// POPULATE DATABASE via batch population
for ($i=0; $i < $total_iterations; $i ++)
{
$batch_data = array();
for ($p=0; $p < $batch_size; $p++) //BATCH CREATION/PROCESSING
{
randomize_entries();
$id = $rows_in_db+$i*$batch_size+$p; // [ (NUMBER OF BATCHES) * (SIZE OF BATCHES) ] * (NUMBER OF ROWS IN CURRENT BATCH)
$batch_data[$p] = [$id,$_SESSION['miles'],$_SESSION['food'],$_SESSION['day'],$_SESSION['number']];
}
$placeholders = implode(',',array_fill(0,$batch_size,'(?,?,?,?,?)')); //make an array with
$values = array_merge(...$batch_data);
$query = "INSERT INTO $database.$table (ID,MILESWALKED, FAVORITEFOOD, FAVORITEDAY, FAVORITENUMBER) VALUES $placeholders";
// Execute the SQL query with the batch data
$types = str_repeat("idssi",$batch_size);
execute_sql($query, $types, ...$values);
}
?>
Alright now to explain this dark-magic. Firstly, the usual, some pertinent variables are defined; these include the number of rows in the database, the desired number of rows, the batch size, and the number of iterations necessary -given the batch size- to populate the table to the number of desired rows. Secondly, it sets up a nested "for" loop; the inner loop makes each individual "batch" while the larger loop handles every batch after each batch is finished. Before the inner loop is called, a new array is created to hold batch information. After this, the inner "batch" loop commences. It first calls the nail-bitingly exciting, randomize_entries, which has had too much blog space already so I won't go into it for a second time. The inner loop then assigns the ID based on the current position within the loop and how much of the database is already populated. After we have our ID, we assign the rest of the information with it which includes our newly randomized, and don't forget personal, details from the functions script. It adds these random entries to the array and once the array has the number of desired entries (specified by the batch size), it exits the loop with its shiny, new array. From here it uses the implode function to just create a string with "(?,?,?,?)" written over and over again -up to the number specified by the "batch_size" variable. We then define our long-awaited query as an insert on the table we defined in the initialization script. Then, penultimately, we define the expected types of our parameters as "idssi". "What does that even mean?", well thanks for asking, imaginary person I'm using as a segway into further discussion. The "types" parameter is a string of characters that defines what data types should be expected by the myslqi_stmt_bind_param function; in this case, it defines the types as "idssi" which is computer non-sense talk for "integer,decimal,string,string,integer". These match up with the data types we provided for each row. The str_repeat part of this script just repeats those 5 letters many, many times since we've loaded all of our batch row data onto a single, long array which has our data points, all of which need to have an expected data type. Lastly, (if you've read up to this point, you've earned a single firm handshake and maybe a nod) we execute this massive query statement which inserts the entire batch, all-at-once, into the database just a few *hundred-thousand* times.
Wow, talk about a lot (maybe I am a crazed database-obsessed coder). If you got that callback to earlier, you get the nod, if you get that reference to later from earlier, you don't get anything more, don't be greedy. That's the first couple of days of development and progress on the Kickback Kingdom's PHP Application Specialist certificate. Next, I'm gonna get more of the UI finished and then move on to the navigation of all this data. Check out my blogs on the development of L.I.C.H. (which are significantly less mind-numbingly "databasey"), as they come out, for progress updates on my game. If you read all of this, thank you, if you didn't read all of this, I don't blame you.
Until next time,
Hans
Coming Soon
Our website is currently under construction. We are working as fast as we can to bring you great content! :)
Stay tuned for something amazing!