This is the summary of Stanford Compiler Lessons on Coursera
Intro
The basic steps of compile is
 lexical analysis
 parsing
 semantic analysis
 optimization
 code generation
Lexical Analysis
the implementation of lexical analysis contains two things:
first, partition the input string into lexemes
Second, identify the token class of each lexeme
Notice that this method mostly use Lefttoright scan, and lookahead sometimes is required.

Token Class(In COOL)
 whitespace: : a nonempty sequence of blanks, newlines, and tabs
 keyword: “if” or “else” or “then” or …
 identifier: strings of letters or digits, starting with a letter
 integer: a nonempty string of digits

Regular Language
The regular expressions over ∑ are the smallest set of expressions including
In regular languages there are rules descirbed in sets like:
 ℇ (Epsilon)
 ‘c’ (Single Charactor)
 A + B (Union)
 AB (Concatation)
 A* (Iteration)

Formal Language
Let ∑ be a set of characters (an alphabet). A language over ∑ is a set of strings of characters drawn from ∑ It is represeted as
L(e) = M
, wheree
stands for regular expression,M
stands for set of strings. In formal languages I’ve learned that multiple syntaxs of regular language can only point to one semantic meaning, which is the formal language. 
Lexical Specification
 Procedure:
 Write a rexp for the lexemes of each token class
 Construct formal language L(R), matching all lexemes for all tokens
 Let input be x1…xn For 1 ≤ i ≤ n check x1…xn ∊ L(R)
 If success, then we know that x1…xi ∊ L(Rj) for some j
 Remove x1…xi from input and go to (3)
 return ERROR token if all strings are not in the lexical specification
 Notice:
 one regular expression should match as long as a string can be.
 regular expression’s priority is determined by the specified order.
 An Error match should always have the lowest priority
 Procedure:

Finite Automata
 A finite automaton consists of
 An input alphabet ∑
 A set of states S
 A start state n
 A set of accepting states F ⊆ S
 A set of transitions s1 (on input a) > s2
 Deterministic Finite Automata (DFA)
 One transition per input per state
 No ℇmoves
 Faster to execute
 Nondeterministic Finite Automata (NFA)
 Can have multiple transitions for one input in a given state (which may ends in different results)
 Can have ℇmoves
 In general, smaller than DFA
 A finite automaton consists of

Implementing Finite Automata

Procedure: Lexical Specification > Regular Expressions > NFA > DFA > Tabledriven Implementation of DFA
 NFA > DFA
 Epsilon Closure: Pick one state, then look at all the states it can reach by following epsilon moves.
 So basically the transition is about finding epsilon closures in NFA of different inputs, then gather each epsilon closure up as one DFA state.
 A DFA can be implemented by a 2D table T (Acturally NFA can also do the trick,but the table would be huge…)
 One dimension is states
 Other dimension is input symbol
 For every transition Si (on input a) > Sk, define T[i,a] = k

Parsing
Finite Automata can really only express things where you can count modulus on k, where k is the number of states.
So we need parsing, which transfer sequence of tokens from lexer to parse tree of the program.