56:219:523 GEOGRAPHIC INFORMATION SYSTEMS FOR DATA SCIENCE
https://theaok.github.io/gisPy most current syllabus (class materials edited continuously)
rugispy@googlegroups.com listserv (everyone in class gets these emails, use it often!) [email me if you didn't get the welcome email or can't email the listserv; I only added the email addresses from the roster, so I may need to add your alternative email]
labs: during office hours (see below)

Fa 2023; Tue *and* Thu 3:35-4:55pm, BSB-336

prerequisites

You need to be comfortable using a computer, and either able to write simple code in Python or willing to learn it. Most of the class material is simple coding in Python. Knowing Python is not a prerequisite; you can learn it in the class.

course description

Introductory + applied: produce maps, put interesting info on them. GIS is useful in all fields that have any geographic/location info.

course objectives

required books

none

software

We'll use Python >=3.10 (latest: 3.11) (python.org). It can be downloaded for free for Linux, Windows, and Mac. We will use several libraries, mostly GeoPandas.
BUT no need to download or install any software: we will run Python online, in a web browser, in the cloud, using so-called "Colab" (2 sections down). But first let's get GitHub running.
GitHub
We will use GitHub to store the Python code in the form of a notebook, and we will edit (and run) the notebook in Colab (next section).
  • sign up or login at github.com
  • (depending on OS and browser) on top left hit "New" or "Create Repository", or top right under plus "+" select "New repository"
  • pick some repository name, say "vis" ; keep selected 'Public'; important!: under "Initialize this repository with" check "Add a README file"; and hit at the bottom "Create repository"
  • then hit "Settings" towards the middle-top right; on the left select "Collaborators" tab and hit "Add people" : "theaok", and hit "Add theaok to this repository"
    workflow: my comments, diffs, inline response [let's go over this again next week]
  • i will run it in my Colab, edit, and upload back
  • diff and response to my comments: actually cleaner and better in Colab: File > Revision history; or clunkier in GitHub: click my commit message to see the so-called diff, the difference between your version and my version. Important! Do make sure to fix things up for the next ps; you may even include an inline response to my comments in your next ps (especially if something is complex or if you disagree)
  • don't forget a meaningful commit message; you can keep uploading newer versions as many times as you like
  • note: when you click the file, you can then click 'History' and see how the file evolved over time :)
  • a thought about file naming: ps1.ipynb, ps2.ipynb, etc.; or ps1, ps2, etc. sections in one file; or just one file that you keep updating throughout with new stuff as we go!
  • [*] bonus/extra: general references on how to get started using Git fully:
  • http://www.sitepoint.com/git-for-beginners/
  • http://rogerdudler.github.io/git-guide/
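    If the word "diff" in the workflow above is new to you, here is a minimal sketch of what one is, using Python's built-in difflib module. The two code versions below are made up for illustration (they are not from any actual ps); GitHub and Colab compute and display their diffs in their own way, this only shows the idea.

```python
import difflib

# Hypothetical notebook cell: your version vs. the version with my edits.
yours = [
    "import geopandas as gpd",
    "gdf = gpd.read_file(url)",
    "gdf.plot()",
]
mine = [
    "import geopandas as gpd",
    "gdf = gpd.read_file(url)",
    "gdf.plot(column='pop', legend=True)  # color the map by a variable",
]


def make_diff(old, new):
    """Return unified-diff lines: '-' marks removed lines, '+' marks added ones."""
    return list(
        difflib.unified_diff(old, new, fromfile="yours", tofile="mine", lineterm="")
    )


diff_lines = make_diff(yours, mine)
print("\n".join(diff_lines))
```

    Lines starting with "-" are in your version only, lines starting with "+" are in mine, and unmarked lines are unchanged context.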
colab
    Just run the Python notebook in Colab and save subsequent versions to GitHub, which will keep track of changes [stick with this for the ps]
  • go to https://github.com/theaok/gisPy/blob/main/map.ipynb and hit 'open in colab' OR go to https://colab.research.google.com and in the popup pick GitHub, then search for:
    https://github.com/theaok/gisPy/blob/main/map.ipynb
    (it should find it and load it into Colab; then follow the instructions at the top of the file, i.e. save it in your GitHub, etc.)
data

    The class is a bit like an independent study: you will carry out some research (by making maps). You need your own data for this class ASAP; the more data and the more complex, the better. The software will need to load the data straight from the web! Some data can be easily downloaded online, e.g. https://gss.norc.org/get-the-data/stata, but many cannot. In that case you have to put the data online yourself [just go over Git, <25mb]: https://theaok.github.io/generic/howToPutDataOnline.html
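
    To make "load the data straight from the web" concrete, here is a minimal sketch using only Python's standard library. The function name load_csv_from_url and the tiny demo table are made up for illustration; in the class notebooks you would more likely pass a URL to a pandas or GeoPandas reader, so treat this as one possible way, not the class method.

```python
import csv
import io
from urllib.request import urlopen


def load_csv_from_url(url):
    """Download a CSV file from a URL and return its rows as a list of dicts."""
    with urlopen(url) as response:
        text = response.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in URL so the sketch runs without a network connection:
# a tiny made-up table packed into a "data:" URL (%0A is a line break).
# For real data, replace it with the direct (raw) link to your own file.
demo_url = "data:text/csv,county,value%0AA,10%0AB,25"

rows = load_csv_from_url(demo_url)
print(rows)
```

    Note that every value comes back as text, so numeric columns need converting (e.g. int(row["value"])) before you compute or map anything with them.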

    icpsr: biggest repository of survey data; check out also var search
    google is great for data search; and it has data search, too
    google cloud/big query has data, too
    kdnuggets listing of sources, a lot!; kdnuggets is great in general for data science
    another kdnuggets listing; maybe actually better start here, easier to wrap your head around
    kaggle

    NOAA
    NASA

    datasets on GitHub
    datahub
    humandata
    academictorrents
    pew
    data.gov
    nasdaq

    and as we have more editions of this class, I will fork the best projects, tag them, and link to them from here

    advice/requirements and grading

  • 2 keys to success: start early AND ask many questions, often (and form study groups: get a couple of people on Zoom, screenshare notebooks, etc.). This is a software class. It is different from non-software classes. You will get stuck often, and whenever you are stuck, email the listserv, ask me, ask your classmates, as opposed to pulling your hair out! And stop by my office, too. Googling (and ChatGPT?) solves most problems, but for many things it's better to talk to me and your classmates; it's also more social/human: if you talk to a computer all the time, it's not healthy.
  • Problem sets (ps): You will write computer code that applies something we covered in class to your data. You may work in groups (<=4), but say who you worked with, and the more people in the group, the better/longer the code must be.
  • grading (strict and harsh!) [incompletes only if documented emergency (eg hospitalization)]

    academic calendar

    tentative; the most up-to-date version is always online. I work on class materials continuously and they'll be changing slightly
    print several slides on one sheet, say 6
    or just annotate electronic pdf

    dive into GIS

    sep5,7 intro sep5vid


    sep7 sep7vid
    anyone made a map yet? want to present?; and let's upload a dataset to GitHub; and let's go over GitHub, Colab, and Python code mechanics again, more slowly; also: for ps0 think not just about data but also about unit of analysis (u/a) or level of analysis; perhaps flip the class!

    sep12,14 data: join/merge sep12vid


    sep14 start with last week's example: abortions sep14vid

    sep19,21 pretty maps sep19vid


    sep21 sep21vid
  • final_project.pdf: just skim through TOC
  • [*] early/volunteer student presentations of maps from ps1
  • flip the class: (I walk around and sit with each of you and go over it; and Q and A; otherwise I'd be looking at your Colabs, and then approach you with ideas)
  • sep26,28 ps1 presentations

    sep26vid
    sep28vid
    5min sharp: i will cut you off! + 10min discussion

    oct3,5 (more advanced) thematic mapping: geopandas bells and whistles oct3vid


    oct5 pick up with subset shapefile in ipynb oct5vid

    oct10,12 wrapping up basics oct10vid

    flip the class
    oct12
    oct12vid

    oct17 ps2 presentations

    oct17vid

    oct19 (thu): CLASS CANCELLED (conflict with events/awards committee duty)

    interactive maps (zoom, move, popup, etc): folium

    note: what we cover from now on is not necessary for the ps, but it does help; pick from the following what is useful and helpful for your research

    oct24,26 go over ps2 comments from listserv and geo-processing oct24vid

    oct26 oct26vid

    oct31,nov2 folium oct31vid

    nov2vid

    nov7,nov9 folium nov7vid

    nov14,nov16 ps3 presentations nov14vid and nov16vid

    nov21 flip the class work on ps4; nov23: no class/Thanksgiving nov21vid


    spatial correlation: geoda

    nov28, nov30 geoda nov28vid

    nov30 nov30vid

    dec5, dec7 ps4/final presentations dec5vid dec7vid

    dec12 class summary/wrap up

    if time: flip the class (I walk around and sit with each of you and go over it; and Q and A; otherwise I'd be looking at your Colabs, and then approach you with ideas) [I fork a couple of the best repos as examples for future classes]

    rules

    attendance: strongly recommended, you're responsible for everything covered, incl discussions and announcements. If you miss a class, consult with a fellow student and/or watch video.

    academic integrity. I am very serious about this. Make no mistake--I may appear accommodating and informal--but I am extremely strict about academic integrity. Violations of academic integrity include cheating on tests or handing in assignments that do not reflect your own work and/or the work of a study group in which you actively participated. Handing in your own work that was performed not for this class (e.g. other class, any other project) is cheating, too. I have a policy of zero tolerance for cheating. Violations will be referred to the appropriate university authorities. For more information see http://fas.camden.rutgers.edu/student-experience/academic-integrity-policy

    accommodating students with disabilities. Any student with a disability affecting performance in the class should contact the disability office ASAP

    do not share or link to class videos! These videocasts and podcasts are the exclusive copyrighted property of Rutgers University and the Professor teaching the course. Rutgers University and the Professor grant you a license only to replay them for your own personal use during the course. Sharing them with others (including other students), reproducing, distributing, or posting any part of them elsewhere -- including but not limited to any internet site -- will be treated as a copyright violation and an offense against the honesty provisions of the Code of Student Conduct. Furthermore, for Law Students, this will be reported by the Law School to the licensing authorities in any jurisdiction in which you may apply to the bar.

    civic engagement component (opportunity for extra credit!)

    Start early. Start thinking about how you want to engage civically today.

    typical civic engagement

    Universities and social science should serve society. You are encouraged to engage with the local community.

    The idea is that you engage civically using research methods. There are several ways to do it. Ideally, you will partner with a local organization, obtain data from them, do some analysis, and present the results to them. You may also use government data, say from the Census Bureau, and present relevant information to locals. A local organization can be a Rutgers research institute such as WRI, CURE, or LEAP, or any other organization such as a school, a soup kitchen, or CamConnect. The Rutgers Office of Civic Engagement may be able to help you contact them. The key idea is partnership: you will use tools from this class to produce output useful to the local community. This is similar to taking the role of an apprentice at a local organization or serving as a consultant.

    Using real-world data poses challenges, which is part of the exercise. Presenting your findings to stakeholders outside of class is also challenging. At the same time, it is fairly easy to contribute locally by using simple tools learned in this class. For instance, a simple comparison of means between two schools in Camden can be revealing and helpful locally.

    An obvious way would be to use data at your workplace or at a workplace of someone you know. However, you need to make sure that it serves society in some way. For instance, it would be straightforward if you work at a hospital or school or fire department; but it would be difficult if you work at Starbucks.

    atypical civic engagement--CONTACT ME FIRST if you consider this!

    Successful completion of atypical civic engagement is estimated to take at least double the time of typical civic engagement.

    You could try to engage at the regional or state level. For instance, you may evaluate some policy in NJ as compared to NY, or produce descriptive statistics of a region that would be useful regionally (e.g. my South Jersey WRI paper http://dept.camden.rutgers.edu/rand-institute/files/changes-across-the-region.pdf). Such engagement typically requires substantial research experience, usually found at a late stage of a PhD program. There may also be some other atypical ways; let me know your ideas.