• jared@discsanddata.com
  • Siem Reap, Cambodia
The Making of Jared’s Disc Golf Power Rankings, Part 1

Ok. Maybe that’s a disingenuous title. Basically I sat in the darkness of my apartment, while my girlfriend wondered what was wrong with me, and typed stuff into a jupyter notebook. Then I commented that out and typed something else. Then I gave my screen a dirty look. Then I realized I typed a period instead of an underscore. Repeat.

What I actually want to focus on here are the online resources that helped me along the way. As always, this will most often be read by me as a repository of things I learned and useful websites. But if it helps you too, all the better.

(Standard warning applies. I’m new at this. My solutions are often not the best, and may be completely wrong. More elegant and correct solutions are welcome in the comment section.)

So what did I learn… or re-learn…

How do you iterate over rows of a Pandas Dataframe?

There are many ways it seems, but the first option offered by the geeks who are all about geeks at https://www.geeksforgeeks.org/different-ways-to-iterate-over-rows-in-pandas-dataframe/ has worked for me.

They say to iterate through the index. For example, if ‘name’ and ‘player_id’ were columns in the dataframe ‘disc_golf_df’, and you wanted to go through the rows and print the value of each column…

for ind in disc_golf_df.index:
    print(disc_golf_df['name'][ind], disc_golf_df['player_id'][ind])

Wicked, huh?

(Future Jared says: “Using iterrows() woulda prolly been better. It’s easier to pull out the values as plain values, as opposed to as Series, I think.”)
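Here is a rough sketch of what Future Jared is getting at, assuming a tiny made-up dataframe (the player names and ids are placeholders, not real data). With iterrows() each row comes back as a Series, so row['name'] is just the value.

```python
import pandas as pd

# Hypothetical sample data; the real disc_golf_df comes from scraped results.
disc_golf_df = pd.DataFrame({
    "name": ["Player A", "Player B"],
    "player_id": [101, 202],
})

rows_seen = []
for ind, row in disc_golf_df.iterrows():
    # row is a Series holding this one row, so row["name"] is a single value
    rows_seen.append((row["name"], row["player_id"]))
    print(row["name"], row["player_id"])
```

For big dataframes, looping over rows is slow either way, so this is probably only fine for small tables like this one.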

Sup with the SettingWithCopyWarning?

I’ve read and re-read the very helpful explanation of the issue by Dataquest at https://www.dataquest.io/blog/settingwithcopywarning/, and I think I kinda get the gist. When you slice a dataframe and then try to set a value on the slice, pandas can’t always tell whether you’re changing the original dataframe or a copy of it. Or something like that. The answer, if you want to alter the original dataframe, is to do the slicing and the setting in a single step using loc….

So, yo, in my dataframe disc_golf_df, whenever the value in the ‘status’ column is ‘DNF’, I want the ‘notes’ column to say, ‘maybe he pooped himself.’ (disc_golf_df is the most mature dataframe ever)…

This is how you can do it…

disc_golf_df.loc[disc_golf_df.status == 'DNF', 'notes'] = 'maybe he pooped himself'

Hahahaha… still hilarious. And correct.
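If you want to try that out yourself, here is a minimal sketch with a made-up three-row dataframe (the names and the 'Finished' status value are my assumptions for illustration). The row selection and the column to set both go inside one .loc call.

```python
import pandas as pd

# Hypothetical sample data standing in for the real disc_golf_df
disc_golf_df = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "status": ["Finished", "DNF", "Finished"],
    "notes": ["", "", ""],
})

# Boolean mask: True for the rows we want to change
dnf_mask = disc_golf_df["status"] == "DNF"

# Select rows and column in one step, then assign
disc_golf_df.loc[dnf_mask, "notes"] = "maybe he pooped himself"
print(disc_golf_df)
```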

How do you sort a dataframe?

Did I really forget how to sort a dataframe? I guess I did. Dumbass. Data to Fish reminded me at https://datatofish.com/sort-pandas-dataframe/.
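For the next time I forget, a quick sketch using sort_values(). The 'rating' column and its numbers are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; 'rating' is an assumed column name
disc_golf_df = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "rating": [1010, 1045, 1032],
})

# Highest rating first; sort_values returns a new dataframe by default
sorted_df = disc_golf_df.sort_values(by="rating", ascending=False)
print(sorted_df)
```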

How do you split strings in Python?

Yeah, another thing I knew and forgot the syntax for. I needed to sort some of my BeautifulSoup scrapings to pull out the data I really wanted. In a shocking plot twist, to split strings, you use split() (with a separator if necessary). Thanks again Geeks.
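Something like this, assuming a made-up scraped string with pipes between the fields (my real scrapings were messier):

```python
# Hypothetical scraped text; the separator and fields are assumptions
scraped = "Player A | MPO | 1045"

# Split on the separator, then strip the leftover spaces from each piece
parts = [piece.strip() for piece in scraped.split("|")]
name, division, rating = parts

# With no separator given, split() breaks on any whitespace
words = "Open Women".split()

print(parts)
print(words)
```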

How do you create null values in a dataframe?

Sometimes you want to get rid of your nulls. Sometimes nulls will do you just fine. If you want a null, put in np.nan. Don’t forget to import numpy as np. Thanks Data to Fish!
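A small sketch of that, assuming a made-up 'prize' column where a zero really means "no data":

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'prize' is an assumed column name
disc_golf_df = pd.DataFrame({
    "name": ["Player A", "Player B"],
    "prize": [500.0, 0.0],
})

# Swap the zero prize for a proper null
disc_golf_df.loc[disc_golf_df["prize"] == 0, "prize"] = np.nan

# isna() flags the nulls, so summing it counts them
missing = int(disc_golf_df["prize"].isna().sum())
print(disc_golf_df)
print(missing)
```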

How do you delete rows or columns from a dataframe?

You wanna drop something from your dataframe? .drop() it. Just specify what you want to drop. Thanks pandas documentation!
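Roughly like this, with a made-up dataframe. The columns= and index= keywords say whether you are dropping columns or rows, and drop() hands back a new dataframe unless you reassign it.

```python
import pandas as pd

# Hypothetical sample data standing in for the real disc_golf_df
disc_golf_df = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "status": ["Finished", "DNF", "Finished"],
    "notes": ["", "", ""],
})

# Drop a column by name
no_notes = disc_golf_df.drop(columns=["notes"])

# Drop a row by its index label
no_first_row = disc_golf_df.drop(index=[0])

print(no_notes)
print(no_first_row)
```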

How do you use Beautiful Soup to scrape web data?

It had been a long time since course three in my Python for Everybody Specialization from Coursera. I needed to remind myself how Beautiful Soup worked. The Dataquest tutorial at https://www.dataquest.io/blog/web-scraping-python-using-beautiful-soup/ is great for that…

How do you replace one value with another in a pandas dataframe?

I needed to use this when I wanted to replace my division names. “Open” became “MPO” and “Open Women” became “FPO.” Thanks to my new old friends at Data to Fish, I made it happen!

disc_golf_df['division'] = disc_golf_df['division'].replace(['Open'],'MPO')

disc_golf_df['division'] = disc_golf_df['division'].replace(['Open Women'],'FPO')

Bam!
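If you have several values to swap, replace() also takes a dictionary, so both changes can happen in one call. A sketch with made-up rows:

```python
import pandas as pd

# Hypothetical sample data with the old division names
disc_golf_df = pd.DataFrame({"division": ["Open", "Open Women", "Open"]})

# Old name -> new name; replace() matches whole cell values here,
# so "Open Women" is not touched by the "Open" rule
division_map = {"Open": "MPO", "Open Women": "FPO"}
disc_golf_df["division"] = disc_golf_df["division"].replace(division_map)

print(disc_golf_df)
```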

How do you turn a column of a dataframe into a list?

A couple times I had to do this. It’s pretty easy. Use to_list(). Just like the Data to Fish people say at https://datatofish.com/convert-pandas-dataframe-to-list/.
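For example, with a made-up 'name' column:

```python
import pandas as pd

# Hypothetical sample data
disc_golf_df = pd.DataFrame({"name": ["Player A", "Player B", "Player C"]})

# Pick the column (a Series), then convert it to a plain Python list
names = disc_golf_df["name"].to_list()
print(names)
```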

How do you comment out a chunk of code in a jupyter notebook?

There’s a shortcut!

command + / (on a Mac) can change …

function dropRight(array, n=1) {
  const length = array == null ? 0 : array.length
  n = length - toInteger(n)
  return length ? slice(array, 0, n < 0 ? 0 : n) : []
}

into…

#function dropRight(array, n=1) {
#  const length = array == null ? 0 : array.length
#  n = length - toInteger(n)
#  return length ? slice(array, 0, n < 0 ? 0 : n) : []
#}

Sweet!

How do you create dynamic dataframe column names?

Whenever a new tournament was played (e.g. one with the nickname GHP), I would find myself typing out new columns like GHP_par, GHP_place, GHP_prize…

I thought there must be a better way. So I started thinking about string formatting like at https://www.geeksforgeeks.org/python-format-function/. That pushed me towards a Stack Overflow question that totally sorted me out. Just had to make my list of tourney nicknames and voila…

for tn in tourney_list:
    disc_golf_df[f'{tn}_par'] = (whatever data source)
    disc_golf_df[f'{tn}_place'] = (whatever data source)
    disc_golf_df[f'{tn}_prize'] = (whatever data source)

That’s a game-changer…
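Here is a runnable sketch of the same idea. The 'XYZ' nickname, the results dictionary, and all the numbers are made up to stand in for "whatever data source"; only GHP comes from my actual list.

```python
import pandas as pd

# Hypothetical tourney nicknames; "XYZ" is invented for the example
tourney_list = ["GHP", "XYZ"]

disc_golf_df = pd.DataFrame({"name": ["Player A", "Player B"]})

# Hypothetical stand-in for the real data source: one list per stat
results = {
    "GHP": {"par": [-10, -4], "place": [1, 2], "prize": [500, 250]},
    "XYZ": {"par": [-3, -7], "place": [2, 1], "prize": [200, 400]},
}

for tn in tourney_list:
    for stat in ("par", "place", "prize"):
        # The f-string builds names like "GHP_par" on the fly
        disc_golf_df[f"{tn}_{stat}"] = results[tn][stat]

print(disc_golf_df.columns.tolist())
```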

How do I get the current week using python?

I wanted to give preference to more recent tournaments, so I wanted to be able to calculate how many weeks ago a tournament was played. “Easy,” says Rajendra from Tutorialspoint:

import datetime
my_date = datetime.date.today() # if date is 01/01/2018
year, week_num, day_of_week = my_date.isocalendar()
print("Week #" + str(week_num) + " of year " + str(year))

gives you…

Week #1 of year 2018

Too easy!
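For the "how many weeks ago" part, one possible approach is to subtract the dates themselves rather than the week numbers. This is only a sketch; weeks_ago is a helper name I made up, and the dates are examples.

```python
import datetime

def weeks_ago(played_on, today):
    """Whole weeks between the day a tournament was played and today."""
    return (today - played_on).days // 7

# Example dates (made up); pass datetime.date.today() for real use
today = datetime.date(2018, 1, 15)
played_on = datetime.date(2018, 1, 1)

weeks = weeks_ago(played_on, today)
week_num = played_on.isocalendar()[1]  # ISO week number of the tournament

print(weeks)     # 2
print(week_num)  # 1
```

Subtracting dates keeps working across New Year, where week numbers reset and a plain week_num subtraction would go negative.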

What’s up with the division operator in python that always returns an integer?

“The real floor division operator is ‘//’. It returns floor value for both integer and floating point arguments.” That’s what the Geeks for Geeks said. So with two integers you get an integer back, and with a float in there you get a floored float.

6//2 = 3

11//5 = 2

That sorta thing…
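A few more cases to poke at, including floats and negative numbers (floor means rounding down, not toward zero):

```python
int_result = 11 // 5      # two ints give an int
float_result = 11.0 // 5  # a float argument gives a floored float
neg_result = -11 // 5     # floors down, away from zero
true_div = 11 / 5         # the single slash is regular division

print(int_result)    # 2
print(float_result)  # 2.0
print(neg_result)    # -3
print(true_div)      # 2.2
```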

To be continued in part 2
