lds.set_index("Area").filter(regex = "ER_.*").rename(columns = lambda s: int(s[3:])).T[["South West", "London"]].plot()
Hiya! I'd like to start using Julia and Pluto instead of Jupyter, but I'm finding it tough to see how to do similar things.
How would I write code to do the same thing as this pandas/matplotlib code? It selects the rows whose "Area" is South West or London, and graphs the values in the ER_{2005, 2006, ..., 2019} columns for each.
Hey! I think AlgebraOfGraphics.jl could be awesome for this. Could you share an example of what lds looks like and the corresponding plot you are trying to make?
Generally, DataFrames.jl is the pandas replacement, with helper packages like DataFramesMeta.jl (among others). Plots.jl is the general plots package, but StatsPlots.jl has a nice interface for working with DataFrames specifically.
I don't know pandas, but based on your description, something like this would be equivalent:
julia> using DataFrames, Random, StatsPlots
julia> df = DataFrame(area = ["South West", "Edinburgh", "London", "South West", "Morecombe"], ER_2005 = rand(1:100, 5), ER_2015 = rand(1:100, 5), something = map(_->randstring(), 1:5))
5×4 DataFrame
 Row │ area        ER_2005  ER_2015  something
     │ String      Int64    Int64    String
─────┼─────────────────────────────────────────
   1 │ South West       71       83  aZQ5TM4K
   2 │ Edinburgh        98       99  BLzqoIFg
   3 │ London           39       15  u43jnMLH
   4 │ South West       58       49  EtrFImPX
   5 │ Morecombe       100       84  1U773XwD
julia> @df df[in(["South West", "London"]).(df.area), r"ER_.*"] plot(cols())
See also: https://dataframes.juliadata.org/latest/man/comparisons/#Comparison-with-the-Python-package-pandas
Yeah! These are the libraries I've been using, though I've been struggling to find alternatives to all of pandas' methods
It seems like the shape of the API is a little different, so translations probably won't be one to one.
I've been finding that a lot of seemingly common operations in R and pylab are quite difficult in Julia though. Matrix transpositions are common enough for Python to have the short .T accessor for them, but there's no blessed implementation I could find.
Is this because I'm looking in the wrong places? Or does Julia expect more things to be manually implemented?
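On transposition specifically, here is a minimal sketch, assuming a reasonably recent DataFrames.jl (1.x). For plain matrices, the postfix ' (adjoint) or permutedims is the closest analogue of .T; for data frames, permutedims(df, col) turns one column's values into the new column names. The sample values are made up for illustration.

```julia
using DataFrames

# Plain matrices: postfix ' is the adjoint, which equals the transpose for real numbers
A = [1 2 3; 4 5 6]
At = A'                 # 3×2 lazy wrapper around A
Ap = permutedims(A)     # 3×2 freshly allocated Matrix

# Data frames: the values of :area become the new column names
df = DataFrame(area = ["South West", "London"],
               ER_2005 = [71, 39],
               ER_2015 = [83, 15])
wide = permutedims(df, :area)
```

The first column of wide keeps the name area and holds the old column names ("ER_2005", "ER_2015"), which is roughly what the pandas .T call produces after set_index("Area").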
Oh, tangential question: the postfix methods are quite a lot easier to chain. Is there a way to write them in the same order?
And thanks for your example code @Sundar R ! It's nice to know I wasn't too far off, this is almost exactly what I was using already. My trouble is with extending it: I want the axes' values to be properly labelled from the names ("ER_nnnn" is being parsed to an int in the Python), and I'm plotting multiple rows in the line chart. That's matplot in R.
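One way to get years on the x-axis with one line per area is to reshape to long format first. This is only a sketch under assumptions: the column names area and ER_nnnn follow the example data frame above rather than the real lds, the numbers are made up, and the plotting relies on StatsPlots' @df macro with the group keyword.

```julia
using DataFrames, StatsPlots

# Made-up stand-in for lds; the real data has ER_2005 through ER_2019
df = DataFrame(area = ["South West", "Edinburgh", "London"],
               ER_2005 = [71, 98, 39],
               ER_2015 = [83, 99, 15])

# Keep only the two areas of interest
picked = subset(df, :area => ByRow(in(["South West", "London"])))

# Wide -> long: one row per (area, year) pair
long = stack(picked, r"^ER_", :area; variable_name = :year, value_name = :er)

# "ER_2005" -> 2005, so the x-axis is numeric
long.year = parse.(Int, replace.(long.year, "ER_" => ""))

# One line per area, years on the x-axis
@df long plot(:year, :er, group = :area, xlabel = "Year", ylabel = "ER")
```

The long layout plays the role of the transpose in the pandas version: each area becomes a group instead of a column.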
I think the ecosystem just isn't as mature, and I'll have to stick to Python's notebooks for now.
It's not immature, it's just different.
I highly recommend reading Bogumił's blog on DataFrames to get a better understanding. For example, you can start with https://bkamins.github.io/julialang/2020/12/24/minilanguage.html Maybe it is outdated, I don't know, but it still gives a very good introduction to the DataFrames workflow.
Also, there are packages that provide some syntactic sugar, for example Chain.jl, Pipe.jl, Underscores.jl. Maybe you'll find them useful.
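For the chaining question, here is a rough sketch with Chain.jl, assuming the same made-up columns as the example data frame above. @chain passes each result as the first argument of the next call, unless _ marks a different position.

```julia
using DataFrames, Chain

# Made-up data in the shape of the earlier example
df = DataFrame(area = ["South West", "Edinburgh", "London"],
               ER_2005 = [71, 98, 39],
               ER_2015 = [83, 99, 15],
               something = ["a", "b", "c"])

result = @chain df begin
    subset(:area => ByRow(in(["South West", "London"])))  # filter rows
    select(:area, r"^ER_")                                 # keep area plus the ER_ columns
    rename(name -> replace(name, "ER_" => ""), _)          # "ER_2005" -> "2005"
end
```

The steps read top to bottom in the same order as the chained pandas methods, which may make translating one-liners easier.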
Couldn't agree more. Here's a quick example using some of those tools to start getting your plot off the ground; I just split out some of the parts for clarity. With Pluto's built-in package manager, running the notebook after you download it should just work™, but I might be pushing it with the Python example at the end :snake:
Screenshot-from-2022-01-13-13-27-13.png
Last updated: Nov 06 2024 at 04:40 UTC