56:219:601; 56:824:719 directed (independent) study
special problems / colloquium [aka "data (science) project"] AND
56:219:603 / 56:219:701 data science capstone/thesis current syllabus
https://theaok.github.io/dirStu
Fall 2025; Tue?? or Thu?? time??, computer lab in the back of the first floor [note: you can also use the lab outside of class time; just stop by my office and ask me for the key; this semester I am here especially on Tue and Thu]
instructor
Adam Okulicz-Kozaryn adam.okulicz.kozaryn@gmail.com
office: 321 Cooper St, room 302; office hours: TBA, and by appointment
this semester always at school on Tue and Thu; usually whole day; stop by
prerequisites
You need to be comfortable using a computer. Some minimum knowledge of Python, R,
or Stata and of data management/computer science is necessary, such as from my
data management, visualization, or GIS courses: https://theaok.github.io/teach
course description
Essentially, this is a class where we work on a publishable paper.
learning objectives/outcomes
demonstrate mastery of the material by writing code for a project/paper; you may co-write code (up to 2 people), but then the project should be twice as good as a single-authored paper
required textbooks and materials
There are no required textbooks. All required materials (code, readings) will be provided.
requirements
Consistency is the key. Sure, there is some flexibility: you
can miss a week or maybe even 2. Indeed, working hard continuously for
some time and then taking time off may be the way to go. But if you don't do anything for
a couple of weeks, you will be in trouble. Definitely, if you don't do
much throughout the semester and try to make it up in the final few
weeks, it won't work.
Strictly speaking, this is advice rather than a requirement, but in
practice it is really a requirement, as it is virtually impossible to
succeed otherwise: ask many questions, and ask often.
Students will write an empirical paper/report, GitHub page, etc. on any topic using one or more of the
techniques covered in this course. A typical paper will be 5 to 20
double-spaced pages. I will give you comments and help with the
paper, and it is a good opportunity to produce a paper. I will
also grade the code that you wrote to produce the results in your paper.
You will submit not only the paper, but also the code that produced the results in
the paper; in fact, you can just submit the code.
Ideally, the paper should be submitted to an academic journal for publication.
let's plan to have a solid draft around midsemester, so that we know where
we stand
we will try to meet as a group in the beginning to go over the
basics, brainstorm, and get going; and towards the end to present and
learn from each other; in the middle we will probably work more one
on one
make no mistake, this is no walk in the park, the bar is high: to get an A, the paper has to be "publishable" at the end of the semester, or at least "publishable" after 1 set of easily doable revisions as per my final comments
data science independent study examples
timeseries / arima
https://colab.research.google.com/github/chetan-957/Independent-Study/blob/main/Bitcoin.ipynb
timeseries / sentiment analysis
https://colab.research.google.com/github/sg2083/independent_study/blob/main/Sentiment_analysis_29_04_wip.ipynb
56:219:603 / 56:219:701 data science capstone/thesis
let's go over (note: these are from sp2025 and may be updated):
The Data Science Master's project is a capstone project: it is the
culmination of all the coursework that has brought you to this point.
You will want to make sure that this project is something that you can highlight prominently on your resume!
Here are the expectations in brief (and some of these points will be addressed in more detail in forthcoming modules):
The project will be a substantial software application involving
data science techniques and tools. This means that at a minimum, it
must involve substantial data volume, appropriate data-cleaning,
exploratory data analysis, visualization,
and domain-specific data analysis that involves data mining, machine-learning or AI techniques.
It can be a group project but with no more than two students per group. Complexity of group projects is expected to scale commensurately with group size.
The project deliverables will include a class demonstration, a comprehensive written report, and a code archive evaluated by the advisor.
This is important: the project must be substantially different
from any previous projects, independent study or coursework projects
that you have already completed. I intend to be very strict about this.
Overall, this course will be run in hybrid format:
There will be two initial in-person class meetings from 12:30pm to
1:45pm in Armitage 105 on Monday, January 27 and Wednesday, January
29.
The Monday meeting will provide additional details about required
elements: code organization, documentation, and report format. In the
Wednesday meeting
each student (or group of two) will present a flash talk (a 5-minute presentation) on their project idea.
After next week, students will maintain a regular cadence of meetings with their project advisors by mutual agreement.
Presentations of projects will be on April 30 and May 5. Details will be provided sometime in the middle of the semester.
Please let me know immediately by email if (a) you are among the
students working with Dr. Dehzangi, Dr. Sanchirico or
Dr. Okulicz-Kozaryn,
or (b) you will be working with me on the Master's capstone. If you
are in category (b),
then also indicate in your email any specific areas or data sources
you have in mind for your project: I am available today from 12:30pm
to 2pm and later from 3pm to 4:30pm in my office if you want to meet me in connection with this.
Finally, you must create a repository for your project work:
I will require access to the repository to check progress, regularity of commits, and addressing of issues or pull requests from me or your project advisor
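To give a rough idea of what "regularity of commits" could mean in practice, here is a minimal Python sketch that finds the longest gap between consecutive commit dates. This is only an illustration and not how progress is actually checked; the function name longest_gap_days and the example dates are made up. It assumes you have the commit dates as ISO strings, e.g. as printed by `git log --date=short --format=%ad`.

```python
from datetime import date


def longest_gap_days(commit_dates):
    """Return the longest gap, in days, between consecutive commit dates.

    commit_dates: iterable of ISO date strings such as "2025-09-02".
    Returns 0 if there are fewer than two distinct dates.
    """
    # Deduplicate (several commits on one day count once) and sort.
    days = sorted({date.fromisoformat(d) for d in commit_dates})
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    return max(gaps, default=0)


# Hypothetical commit dates, made up for illustration only.
example = ["2025-09-02", "2025-09-04", "2025-09-09", "2025-10-07", "2025-10-09"]
print(longest_gap_days(example))  # prints 28
```

A long gap like the 28 days above is the kind of pattern that the consistency requirement warns against.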
https://rutgers.instructure.com/courses/345052/
datasets; and tools and guidelines
and
https://rutgers.instructure.com/courses/345052/discussion_topics/4095502
per info on capstone projects
for examples from the past pull up from my gmail (commented out)
https://huggingface.co/spaces/Krish264/NutriWeb
https://github.com/pavansatya/NutriWeb
just to be safe, delete the data you have posted online, you never know: someone may be picky about it
rules
do not share or link to class videos!
These videocasts and podcasts are the exclusive copyrighted property of Rutgers University and the Professor teaching the course. Rutgers University and the Professor grant you a license only to replay them for your own personal use during the course. Sharing them with others (including other students), reproducing, distributing, or posting any part of them elsewhere -- including but not limited to any internet site -- will be treated as a copyright violation and an offense against the honesty provisions of the Code of Student Conduct. Furthermore, for Law Students, this will be reported by the Law School to the licensing authorities in any jurisdiction in which you may apply to the bar.
attendance
Attendance is recommended. Be advised that you are
responsible for any material covered in the class, whether or not it was in the readings or
lecture notes. You are also responsible for any announcements made in class. For most
students, attendance is simply essential to learning the material. If you do need to miss a
class, be sure to consult with a fellow student to learn what transpired.
incompletes: Generally speaking, the material in this course is best learned as a single unit. I
will grant incompletes only in cases where a substantial change in life circumstances occurs that
is beyond the control of the student, and only with appropriate
documentation.
study groups. You are encouraged to form a regular study group. Many students over the years
have found the study groups to be very helpful. Study groups are permitted and encouraged to
work on the problem sets together. However, each individual student should write up his or her
own answer to hand in, based on his or her own understanding of the material. Do not hand in a
copy of another person’s problem set, even a member of your own group. Writing up your own
answer helps you to internalize the group discussions and is a crucial step in the learning process.
Academic Integrity. I am very serious about this. Make no
mistake--I may appear accommodating and informal--but I am extremely
strict about academic integrity. Violations of academic integrity include cheating on tests or handing in
assignments that do not reflect your own work and/or the work of a study group in which you
actively participated. Handing in your own work that was not done
for this class (e.g. for another class or any other project) is cheating,
too. I have a policy of zero tolerance for cheating. Violations will be referred
to the appropriate university authorities.
For more information see http://fas.camden.rutgers.edu/student-experience/academic-integrity-policy
Accommodating Students with Disabilities.
Any student with a disability affecting performance in the class
should contact the disability office ASAP: http://learn.camden.rutgers.edu/disability/disabilities.html