Berkeley DATA 100: Principles and Techniques of Data Science
DATA 100 is the upper-division core of Berkeley's Data Science major, covering the full pipeline: pandas data wrangling, visualization, sampling, regular expressions, SQL, linear models, gradient descent, regularization, and classification with scikit-learn. It follows DATA 8 and is taught to very large classes.
Fennie is independent and not affiliated with UC Berkeley. This is an unofficial study guide.
Build my DATA 100 study plan
What makes it hard
The exams are the notorious part: long, time-pressured, and full of questions that ask you to read a pandas expression and predict its output, where one misremembered method behavior throws off every later step. The course also quietly assumes real comfort with linear algebra and probability; students who skated through the prerequisites feel it when loss functions and gradient descent arrive.
What you'll cover
- Pandas and data wrangling
- Visualization and EDA
- Sampling and bias
- Regular expressions and SQL
- Linear models and gradient descent
- Regularization and cross-validation
- Classification and logistic regression
The DATA 100 study guide
How to study for Berkeley DATA 100, step by step.
1. Make pandas fluency a daily habit
Exam questions hand you expressions with groupby, merge, and indexing chains and ask what comes out. Write and predict small pandas snippets daily; reading fluency under time pressure is a trained skill.
2. Shore up the math the course assumes
Loss functions, gradients, and the linear-model material lean on MATH 54-level linear algebra and probability. Review before those weeks arrive, because the course won't slow down for the prerequisite gap.
3. Do the assignments without copy-paste
The homework notebooks make it tempting to pattern-match from lecture code. Typing and modifying everything yourself is what builds the recall the closed-book exam sections demand.
4. Build a methods cheat-sheet as you go
Track the pandas methods, regex syntax, and sklearn patterns the course uses, with one example each. Even where reference sheets are allowed, the act of building it is the studying.
5. Work past exams timed: pacing is the exam
Much of DATA 100's exam difficulty is time pressure. Past exams from the course site, worked under strict timing, are the only honest rehearsal for the pace.
6. Run the pipeline on a Daily Plan with Fennie
Upload the DATA 100 schedule and Fennie's Daily Plans keep the coding, math, and exam-prep threads moving together, generating pandas-reading drills and concept quizzes from your actual course materials. Free to start.
Start my DATA 100 plan free
How Fennie helps with DATA 100
Upload the DATA 100 schedule and Fennie's Daily Plans interleave the coding, math review, and exam drills the course demands simultaneously. Chat through what a gnarly pandas chain actually returns or why regularization changes a model, and run timed generated quizzes that rehearse the exams' read-and-predict format.
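To make "why regularization changes a model" concrete, here is a minimal sketch rather than anything taken from the course. It assumes an L2 (ridge) penalty added to mean squared error and fits it with plain gradient descent on synthetic data. The function name `ridge_gradient_descent`, the parameters `lam`, `lr`, and `steps`, and the data itself are all hypothetical choices made for illustration.

```python
import numpy as np


def ridge_gradient_descent(X, y, lam, lr=0.1, steps=2000):
    """Minimize (1/n) * ||X @ theta - y||^2 + lam * ||theta||^2 by gradient descent.

    Hypothetical helper for illustration; the objective form is an assumption.
    """
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        residual = X @ theta - y
        # Gradient of the penalized loss with respect to theta.
        grad = (2.0 / n) * (X.T @ residual) + 2.0 * lam * theta
        theta = theta - lr * grad
    return theta


# Synthetic data, invented for this sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_theta = np.array([3.0, -2.0, 0.5])
y = X @ true_theta + rng.normal(scale=0.1, size=200)

theta_plain = ridge_gradient_descent(X, y, lam=0.0)  # no penalty
theta_ridge = ridge_gradient_descent(X, y, lam=1.0)  # L2 penalty on theta

# The penalty pulls the fitted coefficients toward zero.
print(np.linalg.norm(theta_plain), np.linalg.norm(theta_ridge))
```

Comparing the two printed norms shows the effect: with the penalty switched on, the optimizer trades some fit on the training data for smaller coefficients. Changing `lam` and predicting which way the norm moves makes a useful self-quiz.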
FAQ
Is DATA 100 hard?
Harder than its DATA 8 prerequisite suggests: the exams are long and time-pressured, and the modeling half assumes real comfort with linear algebra and probability. Students who practice reading code against the clock and review the math early tend to do well.
What's the difference between DATA 8 and DATA 100?
DATA 8 is the gentle intro with a custom teaching library; DATA 100 is the professional-tools course — pandas, SQL, sklearn — with deeper statistical modeling. The jump in pace and assumed math is significant by design.
How do I study for DATA 100 exams?
Drill reading and predicting pandas expressions daily, build a methods reference as you go, and work past exams under strict timing. Most students lose points to pace and method-behavior details, not to concepts they never learned.
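A read-and-predict drill of the kind described above might look like the following sketch. The tables and column names are invented for illustration and do not come from any DATA 100 exam. The idea is to write down your prediction for each result before running the code, paying attention to default behaviors such as how `groupby` treats missing keys and which join `merge` performs when `how` is not given.

```python
import pandas as pd

# Hypothetical toy data, invented for this drill.
orders = pd.DataFrame({
    "customer": ["ann", "bob", "ann", None, "cy"],
    "amount": [10.0, 20.0, None, 5.0, 7.0],
})
customers = pd.DataFrame({
    "customer": ["ann", "bob", "dee"],
    "city": ["Oakland", "Berkeley", "Albany"],
})

# Predict first: which customers get a group, and how do the three results differ?
grouped = orders.groupby("customer")["amount"]
sizes = grouped.size()    # rows per group, including rows whose amount is missing
counts = grouped.count()  # non-missing amounts per group
totals = grouped.sum()    # missing amounts are skipped when summing

# Predict first: how many rows survive a merge with default arguments?
merged = orders.merge(customers, on="customer")

print(sizes, counts, totals, merged, sep="\n\n")
```

Two defaults do most of the work here: `groupby` drops rows whose key is missing unless `dropna=False` is passed, and `merge` performs an inner join unless `how` says otherwise. Details like these are easy to misremember under time pressure, so they are worth drilling.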
Pass DATA 100 with a plan, not a cram
Upload your DATA 100 materials and Fennie generates a Daily Plan paced to your deadline — plus chat, flashcards, and quizzes built from the actual course content.
Get started free