Skip to content

0x347 SQL

1. Foundation

SQL language is divided into several distinct parts:

  • schema statements: define the data structure in the database (e.g: create table)
  • data statement: manipulate the data structures (e.g: insert)
  • transaction statement: begin, end and roll back transactions

2. Data Types

2.1. Numeric

In mysql, we have following types, each type can be signed or unsigned


  • tinyint (1 byte)
  • smallint (2 byte)
  • mediumint (3 byte)
  • int (4 byte)
  • bigint (8 byte)

Floating point numbers:

  • float(p, s)
  • double(p, s)

2.2. String

Character data can be stored as either fixed-length CHAR or variable-length strings VARCHAR.

The character set can be choosed, utf-8 is the default encoding in MySQL.

Note the max length is different across different SQL implementations.

  • char: right-padded, will consume fixed space, up to 255 bytes in MySQL
  • varchar: not padded, will not consume fixed space, up to 65535 bytes in MySQL (originally 255), it will store 1,2 bytes prefix to indicate the length
  • text: up to 65535 in MySQL

2.2.1. String functions

Note that different SQL implements different API with similar features, here we list the MySQL's APIs.

  • char: int to char
  • ascii: char to int
  • length: length
  • concat: concatenation
  • strcmp: comparison
  • position: find
  • locate: find from an offset
  • substring: substring
  • insert: insertion

2.3. Temporal


  • date: YYYY-MM-DD
  • datetime: YYYY-MM-DD HH:MI:SS (1000 ~ 9999)
  • timestamp: YYYY-MM-DD HH:MI:SS (1970-01-01 ~ 2038-01-18)
  • year: YYYY
  • time: HHH:MI:SS

3. Operators

3.1. Conditions Operators

Conditions is used to filter out unwanted data, these can be combined using operators such as not, or, and, parenthesis.

  • equality condition (e.g: =)
  • inequality conditioin (e.g: <> or !=)
  • comparision and range condition(e.g: <, >, between and)
  • membership condition: (e.g: in (A, B, C))
  • matching condition: (e.g: _, % wildcards and reg exp)
  • null condition: (e.g: is null, is not null)

3.2. Set Operators

Set operators takes more than 2 tables as their operands.

The following guidelines must apply when performing set operations on two data sets:

  • Both data sets must have the same number of columns
  • The data types of each column must agree

These are all sepcified by ANSI SQL, but MySQL did not implement intersect and except.


  • union, union all: the former removes the duplicates, the later does not
  • intersect: intersection
  • except: not

4. Select

Roughly, the flow to execute a select is as follows:

  • server checks the permission and syntax of the statement
  • query is handed to the query optimizer to determine the most efficient way to execute
  • optimizer picks up an execution plan, which the server uses to execute your query

Several components or clauses make up the select statement

  • select: determines columns
  • from identifies the table from which to retrieve data
  • where: determines rows (filter out unwanted rows)
  • group by: used to group rows together by common column values
  • having: filter out unwanted groups
  • order by: sorts the rows of the final result set by one or more columns (ascent by default)

From clause can take

  • permanent tables (created using create table)
  • derived tables (rows returned by a subquery and held in memory)
  • temporary tables
  • virtual tables (create view)

5. Aggregation


  • count(*): count all lines
  • count(col): count not NULL lines
  • count(distinct col): count distinct lines


6.1. Setup

# first time login using postgres peer authentication
sudo -u postgres psql

# create new user and new db from bash
sudo -u postgres createuser -s ec2-user
sudo -u postgres createdb ec2-user


-- show all tables

-- show schema of table
\d table_name

-- example
ec2-user=# \d
          List of relations
 Schema |  Name   | Type  |  Owner
 public | machine | table | ec2-user
 public | video   | table | ec2-user
 public | word    | table | ec2-user
(3 rows)

ec2-user=# \d word;
                            Table "public.word"
   Column   |            Type             | Collation | Nullable | Default
 word       | character varying(512)      |           |          |
 started    | boolean                     |           |          |
 done       | boolean                     |           |          |
 lang       | character varying(8)        |           |          |
 updated_at | timestamp without time zone |           |          |
 host       | character varying(255)      |           |          |

-- bulk copy 
\copy my_table from './test.csv' CSV

7. Reference

[1 Learning SQL O'Reilly