Berkeley DATA 100: Principles and Techniques of Data Science

DATA 100 is the upper-division core of Berkeley's Data Science major, covering the full pipeline: pandas data wrangling, visualization, sampling, regular expressions, SQL, linear models, gradient descent, regularization, and classification with scikit-learn. It follows DATA 8 and is taught to very large classes.

Fennie is independent and not affiliated with UC Berkeley. This is an unofficial study guide.

What makes it hard

The exams are the notorious part: long, time-pressured, and full of questions that ask you to read a pandas expression and predict its output, where one misremembered method behavior carries through the rest of the answer. The course also quietly assumes real comfort with linear algebra and probability; students who skated through the prerequisites feel it when loss functions and gradient descent arrive.
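For a preview of the loss-function and gradient-descent material mentioned above, here is a minimal sketch. It is an illustration, not course code: it assumes a mean squared error loss, a one-feature linear model with an intercept, NumPy, and a made-up toy dataset. The function names, learning rate, and step count are arbitrary choices.

```python
import numpy as np

def mse_loss(theta, X, y):
    """Mean squared error of the linear model X @ theta."""
    residuals = X @ theta - y
    return float(np.mean(residuals ** 2))

def mse_gradient(theta, X, y):
    """Gradient of the MSE with respect to theta: (2/n) * X^T (X theta - y)."""
    n = X.shape[0]
    return (2.0 / n) * (X.T @ (X @ theta - y))

def gradient_descent(X, y, lr=0.1, steps=2000):
    """Start at zero and repeatedly step against the gradient."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        theta = theta - lr * mse_gradient(theta, X, y)
    return theta

# Toy data (assumed for illustration): y = 1 + 2x with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept
y = 1.0 + 2.0 * x

theta_hat = gradient_descent(X, y)
final_loss = mse_loss(theta_hat, X, y)
```

Being able to derive `mse_gradient` by hand, and to explain why too large a learning rate makes the updates diverge, is the kind of linear algebra and calculus comfort the paragraph above refers to.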

What you'll cover

• Pandas and data wrangling
• Visualization and EDA
• Sampling and bias
• Regular expressions and SQL
• Linear models and gradient descent
• Regularization and cross-validation
• Classification and logistic regression

The DATA 100 study guide

How to study for Berkeley DATA 100, step by step.

1. Make pandas fluency a daily habit
Exam questions hand you expressions with groupby, merge, and indexing chains and ask what comes out. Write small pandas snippets daily and predict their output before running them; reading fluency under time pressure is a trained skill.

2. Shore up the math the course assumes
Loss functions, gradients, and the linear-model material lean on linear algebra at the MATH 54 level and on basic probability. Review before those weeks arrive; the course won't slow down for the prerequisite gap.

3. Do the assignments without copy-paste
The homework notebooks make it tempting to pattern-match from lecture code. Typing and modifying everything yourself is what builds the recall the closed-book exam sections demand.

4. Build a methods cheat-sheet as you go
Track the pandas methods, regex syntax, and sklearn patterns the course uses, with one example each. Even where reference sheets are allowed, the act of building it is the studying.

5. Work past exams timed, because pacing is the exam
Much of DATA 100's exam difficulty is time pressure. Past exams from the course site, worked under strict timing, are the only honest rehearsal for the pace.
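The daily predict-then-run habit from step 1 might look something like the sketch below. The two tables and their column names are invented for illustration and do not come from any DATA 100 assignment or exam. Cover the comments, predict each result, then check yourself.

```python
import pandas as pd

# Hypothetical practice data.
orders = pd.DataFrame({
    "customer": ["ana", "ben", "ana", "cy", "ben"],
    "amount": [10, 20, 30, 40, 50],
})
regions = pd.DataFrame({
    "customer": ["ana", "ben", "dee"],
    "region": ["west", "east", "west"],
})

# Drill 1: what is the type, index, and content of this result?
# A Series indexed by customer: ana 40, ben 70, cy 40.
totals = orders.groupby("customer")["amount"].sum()

# Drill 2: how many rows survive?
# merge defaults to how="inner", so "cy" (no region) and "dee" (no orders)
# both drop out, leaving 4 rows.
merged = orders.merge(regions, on="customer")

# Drill 3: read a chain left to right.
# Group the merged rows by region, sum amounts, sort largest first.
by_region = (
    merged.groupby("region")["amount"]
    .sum()
    .sort_values(ascending=False)
)
top_region = by_region.index[0]  # "east" (70 vs. 40 for "west")
```

Each drill isolates one behavior (the shape of a groupby result, the default join type, the order of operations in a chain), so a wrong prediction tells you exactly which entry to add to your cheat-sheet.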

FAQ

Is DATA 100 hard?

Harder than its DATA 8 prerequisite suggests. The exams are long and time-pressured, and the modeling half assumes real linear algebra and probability comfort. Students who practice reading code under timing and review the math early do well.

What's the difference between DATA 8 and DATA 100?

DATA 8 is the gentle intro with a custom teaching library; DATA 100 is the professional-tools course (pandas, SQL, sklearn) with deeper statistical modeling. The jump in pace and assumed math is significant by design.

How do I study for DATA 100 exams?

Drill reading and predicting pandas expressions daily, build a methods reference as you go, and work past exams under strict timing. Most students lose points to pace and method-behavior details, not to concepts they never learned.
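As an example of the method-behavior details in question, the sketch below collects a few pandas defaults that are easy to misremember. The DataFrame is made up for illustration, and this selection of behaviors is an assumption about what is worth drilling, not a list taken from the course.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing group key and a missing value.
df = pd.DataFrame(
    {"team": ["a", "b", np.nan, "a"], "score": [1.0, np.nan, 3.0, 4.0]},
    index=[10, 11, 12, 13],
)

# .loc slices by label and includes BOTH endpoints: rows 10, 11, 12.
loc_rows = df.loc[10:12]

# .iloc slices by position and excludes the stop: positions 0 and 1 only.
iloc_rows = df.iloc[0:2]

# groupby drops rows whose key is NaN by default (dropna=True),
# so the row with the missing team appears in neither result.
sizes = df.groupby("team").size()              # counts rows: a 2, b 1

# count() counts only non-null values, so team "b" gets 0, not 1.
counts = df.groupby("team")["score"].count()   # a 2, b 0
```

None of these is conceptually hard, but each can flip an exam answer, which is why a methods reference with one concrete example per behavior pays off.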

More Berkeley courses

DATA 8: Foundations of Data Science

DATA 8 is Berkeley's intro data science course and one of the largest courses on campus, combining Python programming, statistical inference, and prediction with real datasets in Jupyter notebooks. It assumes no prior programming or statistics and anchors the Data Science major.
