<pre><code class="julia hljs">x = randn(<span class=hljs-number>1_000</span>) <span class=hljs-comment># 1_000 points iid from a N(0, 1)</span>
μ = mean(x)
σ = std(x)
- <span class=hljs-meta>@show</span> (μ, σ)</code></pre><div class=code_output><pre><code class="plaintext hljs">(μ, σ) = (-0.003429405251489602, 1.0011574237294552)
+ <span class=hljs-meta>@show</span> (μ, σ)</code></pre><div class=code_output><pre><code class="plaintext hljs">(μ, σ) = (0.028927854346407975, 1.0322439340512704)
</code></pre></div>
<p>Indexing starts at 1; use <code>:</code> to select the full range along a dimension.</p>
<pre><code class="julia hljs">X = [<span class=hljs-number>1</span> <span class=hljs-number>2</span>; <span class=hljs-number>3</span> <span class=hljs-number>4</span>; <span class=hljs-number>5</span> <span class=hljs-number>6</span>]
@@ -50,27 +50,9 @@ <h2 id=loading_data ><a href="/MLJTutorials/pub/isl/lab-2.html#loading_data">Loa
<pre><code class="julia hljs"><span class=hljs-keyword>using</span> DataFrames</code></pre>
<p>Let's load some data from RDatasets (the full list of datasets is available <a href="http://vincentarelbundock.github.io/Rdatasets/datasets.html">here</a>).</p>
<pre><code class="julia hljs">auto = dataset(<span class=hljs-string>"ISLR"</span>, <span class=hljs-string>"Auto"</span>)
- first(auto, <span class=hljs-number>3</span>)</code></pre><div class=code_output><pre><code class="plaintext hljs">3×9 DataFrame
- │ Row │ MPG     │ Cylinders │ Displacement │ Horsepower │ Weight  │ Acceleration │ Year    │ Origin  │ Name                      │
- │     │ Float64 │ Float64   │ Float64      │ Float64    │ Float64 │ Float64      │ Float64 │ Float64 │ String                    │
- ├─────┼─────────┼───────────┼──────────────┼────────────┼─────────┼──────────────┼─────────┼─────────┼───────────────────────────┤
- │ 1   │ 18.0    │ 8.0       │ 307.0        │ 130.0      │ 3504.0  │ 12.0         │ 70.0    │ 1.0     │ chevrolet chevelle malibu │
- │ 2   │ 15.0    │ 8.0       │ 350.0        │ 165.0      │ 3693.0  │ 11.5         │ 70.0    │ 1.0     │ buick skylark 320         │
- │ 3   │ 18.0    │ 8.0       │ 318.0        │ 150.0      │ 3436.0  │ 11.0         │ 70.0    │ 1.0     │ plymouth satellite        │</code></pre></div>
+ first(auto, <span class=hljs-number>3</span>)</code></pre><div class=code_output><pre><code class="plaintext hljs">DataFrames.DataFrame(AbstractArray{T,1} where T[[18.0, 15.0, 18.0], [8.0, 8.0, 8.0], [307.0, 350.0, 318.0], [130.0, 165.0, 150.0], [3504.0, 3693.0, 3436.0], [12.0, 11.5, 11.0], [70.0, 70.0, 70.0], [1.0, 1.0, 1.0], ["chevrolet chevelle malibu", "buick skylark 320", "plymouth satellite"]], DataFrames.Index(Dict(:Cylinders => 2,:Horsepower => 4,:MPG => 1,:Displacement => 3,:Origin => 8,:Year => 7,:Acceleration => 6,:Weight => 5,:Name => 9), [:MPG, :Cylinders, :Displacement, :Horsepower, :Weight, :Acceleration, :Year, :Origin, :Name]))</code></pre></div>
<p>The <code>describe</code> function gives a quick summary of the data:</p>
- <pre><code class="julia hljs">describe(auto, :mean, :median, :std)</code></pre><div class=code_output><pre><code class="plaintext hljs">9×4 DataFrame
- │ Row │ variable     │ mean    │ median │ std      │
- │     │ Symbol       │ Union…  │ Union… │ Union…   │
- ├─────┼──────────────┼─────────┼────────┼──────────┤
- │ 1   │ MPG          │ 23.4459 │ 22.75  │ 7.80501  │
- │ 2   │ Cylinders    │ 5.47194 │ 4.0    │ 1.70578  │
- │ 3   │ Displacement │ 194.412 │ 151.0  │ 104.644  │
- │ 4   │ Horsepower   │ 104.469 │ 93.5   │ 38.4912  │
- │ 5   │ Weight       │ 2977.58 │ 2803.5 │ 849.403  │
- │ 6   │ Acceleration │ 15.5413 │ 15.5   │ 2.75886  │
- │ 7   │ Year         │ 75.9796 │ 76.0   │ 3.68374  │
- │ 8   │ Origin       │ 1.57653 │ 1.0    │ 0.805518 │
- │ 9   │ Name         │         │        │          │</code></pre></div>
+ <pre><code class="julia hljs">describe(auto, :mean, :median, :std)</code></pre><div class=code_output><pre><code class="plaintext hljs">DataFrames.DataFrame(AbstractArray{T,1} where T[[:MPG, :Cylinders, :Displacement, :Horsepower, :Weight, :Acceleration, :Year, :Origin, :Name], Union{Nothing, Float64}[23.44591836734694, 5.471938775510204, 194.41198979591837, 104.46938775510205, 2977.5841836734694, 15.541326530612247, 75.9795918367347, 1.5765306122448979, nothing], Union{Nothing, Float64}[22.75, 4.0, 151.0, 93.5, 2803.5, 15.5, 76.0, 1.0, nothing], Union{Nothing, Float64}[7.8050074865717995, 1.7057832474527845, 104.64400390890465, 38.49115993282848, 849.4025600429493, 2.758864119188082, 3.6837365435778295, 0.8055181834183056, nothing]], DataFrames.Index(Dict(:std => 4,:variable => 1,:mean => 2,:median => 3), [:variable, :mean, :median, :std]))</code></pre></div>
<p>To retrieve column names, you can use <code>names</code>:</p>
<pre><code class="julia hljs">names(auto)</code></pre><div class=code_output><pre><code class="plaintext hljs">9-element Array{Symbol,1}:
:MPG
@@ -118,7 +100,9 @@ <h2 id=plotting_data ><a href="/MLJTutorials/pub/isl/lab-2.html#plotting_data">P
</ul>
<p>These tutorials use <code>PyPlot</code>, but any other plotting package would work as well.</p>
- <pre><code class="julia hljs">figure(figsize=(<span class=hljs-number>8</span>,<span class=hljs-number>6</span>))
+ <pre><code class="julia hljs"><span class=hljs-keyword>using</span> PyPlot
+
+ figure(figsize=(<span class=hljs-number>8</span>,<span class=hljs-number>6</span>))
plot(mpg)
</code></pre>
<p><img src="/MLJTutorials/assets/literate/ISL-lab-2-mpg.svg" alt="" />