56:219:601; 56:824:719 directed (independent) study, special problems / colloquium [aka "data (science) project"]
AND 56:219:603 / 56:219:701 data science capstone/thesis
current syllabus: https://theaok.github.io/dirStu
Fall 2025; Tue?? or Thu?? time??, computer lab in the back of the first floor [note: you can also use the lab outside of class time--just stop by my office and ask me for the key; this semester I am here especially on Tue and Thu]

instructor
  • Adam Okulicz-Kozaryn adam.okulicz.kozaryn@gmail.com
  • office: 321 Cooper St, room 302; office hours: TBA, and by appointment
  • this semester always at school on Tue and Thu; usually whole day; stop by
    prerequisites

    You need to be comfortable using a computer. Some minimum knowledge of Python, R, or Stata and of data management/computer science is necessary, for example from my data management, visualization, or GIS courses: https://theaok.github.io/teach

    course description

    Essentially, this is a class where we work on a publishable paper.

    learning objectives/outcomes

  • demonstrate mastery of the material by writing code for a project/paper; you may co-write code (up to 2 people), but then the project should be twice as good as a single-authored paper
    required textbooks and materials

    There are no required textbooks. All required materials (code, readings) will be provided.

    requirements

  • Consistency is key. Sure, there is some flexibility: you can miss a week or maybe even 2. Indeed, working hard continuously for some time and then taking time off may be the way to go. But if you don't do anything for a couple of weeks, you will be in trouble. And if you don't do much throughout the semester and try to make it up in the final few weeks, it definitely won't work.
  • Strictly speaking this is advice rather than a requirement, but in practice it really is a requirement, as it is virtually impossible to succeed otherwise: ask many questions, and ask often.
  • Students will write an empirical paper/report, GitHub page, etc. on any topic using one or more of the techniques covered in this course. A typical paper will be 5 to 20 double-spaced pages. I will give you comments and help with the paper, and it is a good opportunity to produce a paper. I will also grade the code that you wrote to produce the results in your paper. You will submit not only the paper, but also the code that produced the results in the paper; in fact, you can just submit the code. Ideally, the paper should be submitted to an academic journal for publication.
  • let's plan to have a solid draft around midsemester, so we know where we stand
  • we will try to meet as a group in the beginning to go over basics, brainstorm and get going; and towards the end to present and learn from each other; in the middle we probably would work more one on one

  • make no mistake, this is not a walk in the park, the bar is high: to get an A, the paper has to be "publishable" at the end of the semester, or at least "publishable" after 1 set of easily doable revisions as per my final comments

    data science independent study examples

    timeseries / arima https://colab.research.google.com/github/chetan-957/Independent-Study/blob/main/Bitcoin.ipynb
    timeseries / sentiment analysis https://colab.research.google.com/github/sg2083/independent_study/blob/main/Sentiment_analysis_29_04_wip.ipynb

    56:219:603 / 56:219:701 data science capstone/thesis

    let's go over (note these are from sp2025, and may be updated):

    The Data Science Master's project is a capstone project: it is the culmination of all the coursework that has brought you to this point. You will want to make sure that this project is something that you can highlight prominently on your resume!
    Here are the expectations in brief (and some of these points will be addressed in more detail in forthcoming modules):
    The project will be a substantial software application involving data science techniques and tools. This means that at a minimum, it must involve substantial data volume, appropriate data-cleaning, exploratory data analysis, visualization, and domain-specific data analysis that involves data mining, machine-learning or AI techniques. It can be a group project but with no more than two students per group. Complexity of group projects is expected to scale commensurately with group size. The project deliverables will include a class demonstration, a comprehensive written report, and a code archive evaluated by the advisor. This is important: the project must be substantially different from any previous projects, independent study or coursework projects that you have already completed. I intend to be very strict about this.
    Overall, this course will be run in hybrid format:
    There will be two initial in-person class meetings from 12:30pm to 1:45pm in Armitage 105 on Monday, January 27 and Wednesday, January 29. The Monday meeting will provide additional details about required elements: code organization, documentation, and report format. In the Wednesday meeting each student (or group of two) will present a flash talk (a 5-minute presentation) on their project idea. After next week, students will maintain a regular cadence of meetings with their project advisors by mutual agreement. Presentations of projects will be on April 30 and May 5. Details will be provided sometime in the middle of the semester.
    Please let me know immediately by email if (a) you are among the students working with Dr. Dehzangi, Dr. Sanchirico or Dr. Okulicz-Kozaryn, or (b) you will be working with me on the Master's capstone. If you are in category (b), then also indicate in your email any specific areas or data sources you have in mind for your project: I am available today from 12:30pm to 2pm and later from 3pm to 4:30pm in my office if you want to meet me in connection with this.
    Finally, you must create a repository for your project work: I will require access to the repository to check progress, regularity of commits, and addressing of issues or pull requests from me or your project advisor



    https://rutgers.instructure.com/courses/345052/ datasets; and tools and guidelines

    and https://rutgers.instructure.com/courses/345052/discussion_topics/4095502 for info on capstone projects

    for examples from the past pull up from my gmail (commented out) https://huggingface.co/spaces/Krish264/NutriWeb https://github.com/pavansatya/NutriWeb
    just to be safe, delete the data you have posted online, you never know: someone may be picky about it

    rules

    do not share or link to class videos! These videocasts and podcasts are the exclusive copyrighted property of Rutgers University and the Professor teaching the course. Rutgers University and the Professor grant you a license only to replay them for your own personal use during the course. Sharing them with others (including other students), reproducing, distributing, or posting any part of them elsewhere -- including but not limited to any internet site -- will be treated as a copyright violation and an offense against the honesty provisions of the Code of Student Conduct. Furthermore, for Law Students, this will be reported by the Law School to the licensing authorities in any jurisdiction in which you may apply to the bar.

    attendance: Attendance is recommended. Be advised that you are responsible for any material covered in the class, whether or not it was in the readings or lecture notes. You are also responsible for any announcements made in class. For most students, attendance is simply essential to learning the material. If you do need to miss a class, be sure to consult with a fellow student to learn what transpired.

    incompletes: Generally speaking, the material in this course is best learned as a single unit. I will grant incompletes only in cases where a substantial change in life circumstances occurs that is beyond the control of the student, and only with appropriate documentation.

    study groups. You are encouraged to form a regular study group. Many students over the years have found the study groups to be very helpful. Study groups are permitted and encouraged to work on the problem sets together. However, each individual student should write up his or her own answer to hand in, based on his or her own understanding of the material. Do not hand in a copy of another person’s problem set, even a member of your own group. Writing up your own answer helps you to internalize the group discussions and is a crucial step in the learning process.

    Academic Integrity. I am very serious about this. Make no mistake--I may appear accommodating and informal--but I am extremely strict about academic integrity. Violations of academic integrity include cheating on tests or handing in assignments that do not reflect your own work and/or the work of a study group in which you actively participated. Handing in your own work that was not performed for this class (e.g. for another class or any other project) is cheating, too. I have a policy of zero tolerance for cheating. Violations will be referred to the appropriate university authorities.

    For more information see http://fas.camden.rutgers.edu/student-experience/academic-integrity-policy

    Accommodating Students with Disabilities. Any student with a disability affecting performance in the class should contact the disability office ASAP: http://learn.camden.rutgers.edu/disability/disabilities.html