scripting

Jeremy Faden edited this page Mar 2, 2020 · 80 revisions

Purpose: Long-awaited refresh of scripting documentation.

Introduction to Scripting

Autoplot uses the programming language Jython to provide scripting. This is like Python, using the same syntax, but is based on the Java Virtual Machine rather than C-based libraries. With scripting, the scientist can solve many common problems. Some are simple, like seeing what the ratio of two data sets looks like. Others are more complex, like creating plots for each day of a ten-year mission. There is no fixed limit to what can be done with scripting, because all of Java is available along with the libraries available in Autoplot.

Autoplot's data handling library, QDataSet, is used to represent data. This is used much like NumPy arrays, but along with data, these arrays carry metadata like axis labels, timetags, and units. Autoplot is used to easily load data into this environment using the "getDataSet" command and data URIs. For example, consider this script:

ds1= getDataSet( 'http://autoplot.org/data/image/Capture_00158.jpg?channel=greyscale' )
ds2= getDataSet( 'http://autoplot.org/data/image/Capture_00159.jpg?channel=greyscale' )
result= abs( ds2 - ds1 )

It should be fairly clear what the script does. It loads in two grayscale views of images, and then assigns the absolute difference to the variable "result." This variable could then be plotted, for example, or we might look for the greatest difference.

Script Editor

Autoplot provides an editor GUI for working with scripts. Selecting Options→Enable Feature→Script Panel will reveal a tab named "script" where scripts may be entered and executed. The editor provides simple completions for this environment. To see completions, press TAB or ctrl-space.

The editor also has a control for the context where the script will be run. There are two kinds of scripts: application context and data source context. Data source context scripts know only what's needed to load data. These might be called by a server and used to send data to a remote client. These don't have the command "plot," for example, because the Autoplot application isn't running, only its libraries. The other context is the application context, and this has access to everything, including references to the running application. These scripts can load dataset URIs to make plots, respond to mouse events, and create new plot rendering types.

Data Source Context

Here scripts can load in data and return a new dataset (or datasets). This is the "data source context" and these are files that can be used as if they were datasets. If the above script were saved in the file "http://autoplot.org/data/imageDiff.jyds," then http://autoplot.org/data/imageDiff.jyds?result would refer to this dataset. (Note result is what's plotted by default.) This allows scientists to publish operations done to data as well as the data itself. Note these scripts are unaware of the Autoplot application: they can load and operate on data, but they cannot plot it. Commands available in this context are described below under the section "Ops."

Script examples: https://sourceforge.net/p/autoplot/code/HEAD/tree/autoplot/trunk/JythonDataSource/src/ and http://autoplot.org/cookbook#Scripting.

These scripts are saved with the extension ".jyds" and, like all Autoplot data, can reside on a web site so that anyone can use them from anywhere.

Application Context

These scripts access the application itself. Take the following:

trs= generateTimeRanges( '$Y-$m-$d', '2010-January' )
for tr in trs:
   plot( 'vap+cdaweb:ds=AC_H0_MFI&id=Magnitude&timerange='+tr )
   writeToPng( '/tmp/ap/%s.png' % tr )

This script would run the application through each day of the month January 2010, making images of each day. All commands are available in this context.

Examples are available at https://sourceforge.net/p/autoplot/code/HEAD/tree/autoplot/trunk/Autoplot/src/scripts/, http://autoplot.org/cookbook#Scripting, and https://github.com/autoplot/dev/

This javadoc, https://ci-pw.physics.uiowa.edu/job/autoplot-javadoc/lastSuccessfulBuild/artifact/doc/org/autoplot/ScriptContext.html, describes the additional commands available in this Jython Application Context. Note the symbol "dom" is also available in Application Context scripts, and is the state of the application canvas.

Application context scripts are saved with the extension ".jy" and scripts need not be saved to disk to run.

Getting Started

This document aims to introduce scripting well enough that the scientist can understand scripts they encounter and write scripts on their own.
There are hundreds of commands available, and this section starts with a number of commonly used commands to introduce scripting.

Name | Description | Example
getFile | retrieve a file to make it local to the machine | f='http://autoplot.org/data/14MB_1.qds'; print(getFile(f,monitor).length())
getDataSet | load the URI into memory | u='http://autoplot.org/data/14MB_1.qds'; ds=getDataSet(u,monitor)
ds.putProperty | set a property for the data | ds.putProperty(QDataSet.TITLE,'14 MB of data')
plot | plot the data set | plot(ds,ytitle='14MB')
trim | trim the data to the indices or range | tr='2000-01-19T01:00/2000-01-20T10:00'; ds=trim(ds,datumRange(tr))
where | return indices where the condition is true | r=where(ds.gt(1e5))
sum | return the sum of the data | print(sum(ds[where(ds.gt(1e5))]))
reduceMax | return the maximum value found in rows of the dataset | plot(reduceMax(ds,1))
synchronize | synchronize data to the same timetags by finding nearest neighbors | (ds1,ds2)=synchronize(ds0,(ds1,ds2))

You can try each of these commands interactively in the log console at the "AP>" prompt, or in the script panel. If your console is not enabled, activate it with [menubar]→Options→Enable Feature→Log Console. You will want to enable the script panel as well.

Here is an example script you can try:

ds=getDataSet('http://autoplot.org/data/14MB.qds',monitor)
r=where(ds.gt(1e5))
plot(ds[r],title='My Data')

The first line loads the data into the variable "ds," passing the variable monitor so that progress feedback can be shown. When a script is run, a number of variables are already defined, like monitor. The variable ds is a QDataSet, which is like an array in any other language, but also carries metadata and the time tags.

The second line uses the "where" command, which is much like the IDL where command, to return a list of indices showing where the condition is true. Note that ds is a two-index dataset, or "rank 2" dataset, containing measurements for each time and energy of the dataset. The variable "r" is also rank 2, but the first index is for the count of true positions, and the second index is for the number of indices. So ds[r[0,0],r[0,1]] is the first value where the condition is true, and ds[r[-1,0],r[-1,1]] is the last.

The third command plots the subset of the data where the condition is true, using "My Data" for the title of the plot.
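
The index layout described for "r" can be tried outside Autoplot with plain Python lists. The following is a minimal sketch: where_rank2 is a hypothetical helper that mimics the layout described above (one row per true position, one column per index), it is not Autoplot's where command, and the data values are made up.

```python
def where_rank2(data, condition):
    """Return [i, j] index pairs, in row-major order, where condition(value) is true."""
    return [[i, j]
            for i, row in enumerate(data)
            for j, value in enumerate(row)
            if condition(value)]

# made-up rank 2 data: 3 times by 4 energies
ds = [[1e3, 2e5, 5e2, 3e5],
      [4e4, 9e4, 1e6, 7e3],
      [8e5, 1e2, 6e4, 2e3]]

r = where_rank2(ds, lambda v: v > 1e5)

first = ds[r[0][0]][r[0][1]]     # plays the role of ds[r[0,0],r[0,1]] in the text
last = ds[r[-1][0]][r[-1][1]]    # plays the role of ds[r[-1,0],r[-1,1]] in the text
print(len(r), first, last)
```

Here the first index of r counts the true positions and the second index selects which of the two dataset indices is wanted, matching the description of the rank 2 result.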


Tab completions are useful for looking up command documentation.

Building Scripts

Use of Progress Monitor

All scripts have a progress monitor that can be used to provide feedback to the script's invoker. This is the variable 'monitor' and is used like so:

monitor.setTaskSize(200)                     # the number of steps (arbitrary units)
monitor.started()                            # the task is started  

d=getFile('http://autoplot.org/data/14MB.qds',monitor.getSubtaskMonitor(0,100,'load file'))
for i in xrange(100):                        # xrange(100) iterates through 0,1,...,98,99
   if ( monitor.isCancelled() ): break       # check to see if the task has been cancelled.
   monitor.setProgressMessage('at %d' % i)   # this describes actions done to perform the task.  
   monitor.setTaskProgress(i+100)
   sleep(100)                                # sleep for 100 milliseconds

monitor.finished()     # indicate the task is complete

A well-written script will use the monitor to effectively convey information to the person running it. Imagine the scientist is the CEO of a company, and the script is the Manager of a process. The process is implemented by a Worker. All three parties use the progress monitor. The Worker calls setProgressMessage and setTaskProgress to convey the state of the task. The Worker checks isCancelled to see if it should abort the process. The Manager calls setLabel to convey the overall goal of the process. Typically Autoplot will play the role of the Manager, setting the label "loading data" or "executing script", but it is acceptable for the script to take on this role as well, since it can be more descriptive.
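
The division of roles can be sketched in plain Python. SketchMonitor below is a stand-in that only borrows the method names used in this section; it is not the monitor class Autoplot provides, and the cancel method and the worker function are assumptions made for the illustration.

```python
class SketchMonitor(object):
    """Stand-in with the method names used in the text; not Autoplot's monitor class."""
    def __init__(self):
        self.label = ''
        self.message = ''
        self.size = 0
        self.progress = 0
        self.cancelled = False
        self.done = False
    def setLabel(self, label): self.label = label
    def setTaskSize(self, size): self.size = size
    def started(self): self.progress = 0
    def setProgressMessage(self, message): self.message = message
    def setTaskProgress(self, progress): self.progress = progress
    def isCancelled(self): return self.cancelled
    def cancel(self): self.cancelled = True    # hypothetical: what a cancel button would do
    def finished(self): self.done = True

def worker(items, monitor):
    """The Worker reports its state and checks for cancellation on each step."""
    monitor.setTaskSize(len(items))
    monitor.started()
    results = []
    for i, item in enumerate(items):
        if monitor.isCancelled():
            break
        monitor.setProgressMessage('processing %s' % item)
        results.append(item.upper())
        monitor.setTaskProgress(i + 1)
    monitor.finished()
    return results

monitor = SketchMonitor()
monitor.setLabel('converting names')       # the Manager states the overall goal
out = worker(['a', 'b', 'c'], monitor)
print(monitor.label, monitor.message, monitor.progress, out)
```

The Worker never sets the label, and the Manager never reports step-by-step progress, which keeps the messages shown to the person running the script consistent.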

Adding Controls

Controls are added to the script using the "getParam" call. This call will get numbers, strings, URIs, and other types, using named parameters. Default values must be specified as well, and a single-line label may be used to describe the control. These controls are usually provided to the scientist using the script as a Java GUI control, but using a call like this allows scripts to be called from the command line with command line arguments, or from a web server where HTML is used to create the controls.

s= getParam( 's', 'deflt', 'label to describe' )   # gets a string parameter, with default value "deflt"
f= getParam( 'f', 2.34, 'label to describe' )      # gets a float parameter, with default value 2.34
i= getParam( 'i', 100, 'array size' )              # gets an integer parameter.
v= getParam( 'v', 100.0, 'volume' )                # gets another float parameter.
e= getParam( 'e', 'RBSPA', 'spacecraft', [ 'RBSPA', 'RBSPB' ] )   # enumeration with the values given
b= getParam( 'b', 'F', 'apply correction', [ 'T', 'F' ] )         # booleans are just enumerations with the values 'T' and 'F'
l= getParam( 'l', 0, 'threshold', 
             { 'labels':['all','<20','20-50','>50'], 'values':[0,1,2,3] } ) # labels provided for each value of enumeration.

Autoplot will look for getParam calls in scripts and automatically add a control for each to the GUI. The type is determined by the default value. Types are shown above, and there are also "Datums" which are a physical quantity (like 5Hz), "DatumRanges" which span a range from min datum to max datum, often used for time ranges, "URIs" which locate data, and "URLs" which locate web resources. Note files are not a type, and URLs can be used for this purpose with the "file:" prefix.

This is mostly used for .jyds scripts that create new datasets, but all script types can use this command. The jyds plugin creates a GUI by simplifying the script to just getParam calls and trivial commands.

Enumerations are supported as well, where a list of possible values is enumerated. For example:

sensor= getParam( 'sensor', 'left', 'sensor antenna', ['left','right'] )

will get the parameter sensor, which can be either left or right (with left as the default). When a GUI is created, a drop-down list of possible values is used instead of a text entry field.

Last, booleans are allowed, and a checkbox is used when a GUI is produced:

correct= getParam( 'correct', 'T', 'perform correction on the data', [ 'T', 'F' ] )
if correct=='T': doCorrection()

Note you cannot use the result as a boolean in the Jython code. You must compare it to 'T'. With Autoplot 2020, you can finally use booleans to avoid this problem:

correct= getParam( 'correct', True, 'perform correction on the data', [ True, False ] )
if correct: doCorrection()

Application-context scripts can use getParam as well. When the execute button is pressed the default values are used, but when shift-Execute is pressed a dialog for controlling the parameters is shown. Scripts can be run from the command line as well, and would look like:

java -cp autoplot.jar org.autoplot.AutoplotUI --script /tmp/myscript.jy sensor=right correct=T
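
The idea that the default value determines the type, and that the same parameters can arrive as name=value command-line arguments, can be sketched in plain Python. Both parse_args and get_param_sketch are hypothetical helpers written for this illustration; they are not Autoplot's getParam, and the conversion rules are assumptions.

```python
def parse_args(argv):
    """Split command-line style 'name=value' strings into a dictionary."""
    return dict(arg.split('=', 1) for arg in argv)

def get_param_sketch(args, name, default, label, values=None):
    """Return args[name] converted to the type of default, or default if absent.

    The label would only be used to describe the control in a GUI.
    """
    if name not in args:
        return default
    text = args[name]
    if isinstance(default, bool):          # test bool before int: bool is a subclass of int
        value = (text == 'T' or text == 'True')
    elif isinstance(default, int):
        value = int(text)
    elif isinstance(default, float):
        value = float(text)
    else:
        value = text
    if values is not None and value not in values:
        raise ValueError('%s must be one of %s' % (name, values))
    return value

args = parse_args(['sensor=right', 'correct=T'])
sensor = get_param_sketch(args, 'sensor', 'left', 'sensor antenna', ['left', 'right'])
correct = get_param_sketch(args, 'correct', 'T', 'perform correction on the data', ['T', 'F'])
size = get_param_sketch(args, 'i', 100, 'array size')    # not on the command line, so the default is used
print(sensor, correct, size)
```

Rejecting values outside the enumeration mirrors what a drop-down list enforces in a GUI.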

Exceptions and Logging

A process will usually run normally, but we know exceptional cases may occur, and we want to invoke special code to handle them. For example, we may open a sequence of files, and if one is misformatted, we want to log it to a file and carry on with processing other files.

In Jython we use try/except blocks, something like:

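ERROR= open( '/tmp/pngs/errors.txt', 'w' )   # log of failures; this file name is only an example
try: 
    plot( uri )
    writeToPng( '/tmp/pngs/%s.png' % uri )
except Exception, ex:
    ERROR.write( '# unable to plot ' + uri + ' because of ' + str(ex) + '\n' )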

and sometimes you need to throw an exception, indicating a strange case the script has encountered:

if ds.length()==0:
    raise Exception('Dataset is empty')
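
The "log it and carry on" pattern can be tried in plain Python without Autoplot. This is a minimal sketch: parse_record and the list of inputs are made-up stand-ins for reading a sequence of files, and it uses the "except ... as ex" spelling, which newer Python versions require in place of the comma form shown above.

```python
import os
import tempfile

def parse_record(text):
    """Hypothetical parser: raises ValueError when the text is not a number."""
    return float(text)

inputs = ['1.5', 'not-a-number', '3.0']     # made-up stand-ins for a sequence of files
log_path = os.path.join(tempfile.mkdtemp(), 'errors.txt')

values = []
error_log = open(log_path, 'w')
try:
    for text in inputs:
        try:
            values.append(parse_record(text))
        except ValueError as ex:
            # record the failure and carry on with the remaining inputs
            error_log.write('# unable to parse %s because of %s\n' % (text, ex))
finally:
    error_log.close()

print(values)
```

Placing the try/except inside the loop is what lets one bad input be skipped; wrapping the whole loop instead would stop at the first failure.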

Logging is useful for providing feedback to the script developer when things are broken but suppressing information when it's not needed.

from java.util.logging import Logger,Level
from org.das2.util import LoggerManager
logger= LoggerManager.getLogger('myScript')
logger.setLevel(Level.FINE) # Normally this is set to INFO, meaning only informational messages are printed.
logger.info('loading data')    # the script invoker is interested in these messages
logger.fine('reading record')  # only the developer is interested in this fine detail.

Loggers are useful because you can set the level to Level.FINE when debugging code, but then leave it at Level.INFO when using it for science. The console also has a useful control for setting log levels.

Symbols Automatically Defined

Name | Description
monitor | progress monitor, which can be ignored or passed to a slow function
dom | object representing the state of the current application
PWD | directory containing the script, a URL like http://autoplot.org/data/tools/ or file:///home/user/analysis/

The Autoplot DOM contains the state of the application. This is a tree-like structure containing the list of plots and their positions on the canvas, the data URIs, and the plot elements, each with references to the data to read and the plot onto which the data will be rendered. Any DOM property can be set; for example, try dom.plots[0].title='My Data' or dom.plotElements[0].style.color=Color.RED.

The DOM refers to a running application, and you will often see references containing "controller" as in dom.plots[0].controller.isPendingChanges(). The plot's controller node is added to manage the run-time data needed for managing the plot. This also provides access to the Das2 objects which are used to implement the plot, and additional functionality is provided by these. (Das2 is the graphics library used to create Autoplot.) Last, the controller node may provide functions useful in scripts, like dom.controller.addConnector(dom.plots[0],dom.plots[1]).

PWD can be used in scripts to refer to the folder containing the script. This is a string containing the URI location of the script, be it on a local hard drive or a remote web folder. This will always end with a slash, and will often start with "http://" or "file://". The script editor tab allows a script to be composed without saving it to a file. In this case, PWD is not defined.
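
Because PWD always ends with a slash, a resource stored next to the script can be located by simple concatenation. The sketch below is plain Python; resolve and the file name helper.jyds are hypothetical, and the two folder URIs are the examples given in the table above.

```python
def resolve(pwd, name):
    """Join a PWD-style folder URI, which ends with a slash, and a relative name."""
    if not pwd.endswith('/'):
        raise ValueError('expected a folder URI ending with a slash: ' + pwd)
    return pwd + name

print(resolve('http://autoplot.org/data/tools/', 'helper.jyds'))
print(resolve('file:///home/user/analysis/', 'helper.jyds'))
```

A script written this way keeps working when its folder is moved from a local disk to a web site, since only PWD changes.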

More Utility

Scripts have a central role in Autoplot. While Autoplot makes it easy to plot data directly from files, often a bit more "massaging" of the data is needed before it's useful for analysis. That first use case is just one of many fulfilled by scripts.

Scripts allow additional functionality to be added to the application. For example, PNGWalk generation was once just a script. Scripts allow new functionality to be added to Autoplot by anyone, not just Java developers who have commit access to Autoplot's source code. Note that when a script is run, Autoplot provides an option to add it to the Tools menu, so that it's easily run. Note that a script run from the Tools menu is run without any security checks, because scripts referenced there are trusted for use.

Scripts can also be used to create simple web applications. For example, https://jfaden.net/AutoplotServlet/ScriptGUIServlet shows how a script is read by the server to create a web GUI and the image resulting from running the script is displayed alongside the controls. Groups interested in setting up such a server should contact Autoplot's developers.

Simple commands can be created as well, using the command-line interface. For example, download the script from https://jfaden.net/AutoplotServlet/ScriptGUIServlet and then run at the command line:

/usr/bin/java -jar /home/jbf/bin/autoplot.jar --headless --testPngFilename=/tmp/foo.png --script=demoParams.jy nc=4 noise=0.01

Autoplot's "Run Batch" tool takes a script and generates parameter states and runs the script against each state. For example, suppose you want to detect an error condition in your data. You could write a script that detects if one day's data contains the condition, then use the Run Batch tool to test the entire mission.
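
The pattern behind this use of Run Batch, one small per-day check applied to a generated list of days, can be sketched in plain Python. The function has_condition is a made-up placeholder for the script that detects the error condition, and the date range is arbitrary; this does not show how the Run Batch tool itself is implemented.

```python
from datetime import date, timedelta

def days(start, stop):
    """Yield ISO day strings from start up to, but not including, stop."""
    day = start
    while day < stop:
        yield day.isoformat()
        day += timedelta(days=1)

def has_condition(day):
    """Hypothetical per-day check; here it simply flags the 13th of each month."""
    return day.endswith('-13')

flagged = [d for d in days(date(2010, 1, 1), date(2010, 4, 1)) if has_condition(d)]
print(flagged)
```

Keeping the check a function of a single day is what makes it easy to hand to a batch tool, since each parameter state can be run independently.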

Scripts are useful for testing as well. The Autoplot testing server contains hundreds of scripts which verify corners of its functionality which aren't tested using .vap files. Often scripts are provided by others which demonstrate a bug, and these scripts are incorporated into the testing environment to ensure the bug doesn't return.

Relation Between .jy, .jyds, and .vap Files

With two types of Jython scripts running around, it can be confusing as to what is used where. A .jy script can do everything Java and Autoplot can do, and can have references to .vap files. These extend Autoplot's functionality to perform some science task. For example, a .jy script can load a .vap and create a screenshot. It can also contain a "getDataSet" call to a .jyds file as well. A .vap file can contain references to a .jyds file (to load data), but it will not be able to use a .jy script.

Using GitHub (and GitLab) to Store and Serve Scripts

Autoplot tries to work "in the web" so that people at different institutions can share vap files that grab files from web sites, and no one has to think about where to download data before getting to analysis. To facilitate this, special support for GitHub and GitLab instances has been added, and scripts can be served from these, along with .vap files.

Autoplot does this by interacting with a FileSystem interface. Whether it's a local directory or a web directory, it all looks the same to Autoplot as it requests files for plotting. (Remote files are downloaded and stored in the autoplot_data/fscache area, transparently.) GitLab instances are appealing for this use because they are widely used, easily understood, and can be edited directly. (GitHub is handled similarly.) Two scientists can collaborate using the same script, making changes which are trivially seen by both.
Autoplot supports GitLab instances, knowing for example that "raw" must be added to the URL to download the file.
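
The kind of URL rewriting involved can be sketched in plain Python. This is only an illustration of the common convention where a file's web page URL contains "/blob/" and the downloadable file uses "/raw/" instead; to_raw_url, the host gitlab.example.org, and the project path are hypothetical, and Autoplot's own handling may differ.

```python
def to_raw_url(url):
    """Rewrite a repository 'blob' page URL into a 'raw' file URL (illustrative only)."""
    if '/blob/' not in url:
        raise ValueError('not a blob URL: ' + url)
    return url.replace('/blob/', '/raw/', 1)

page = 'https://gitlab.example.org/group/project/blob/master/scripts/demo.jy'
print(to_raw_url(page))
```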

Presently GitLab instances are not detected automatically, and within the code of Autoplot there is a list of the few supported instances. This will soon be resolved, but for now groups must request by email that a GitLab instance be added. Also, only public projects are supported. This too may change, as groups request the feature.

Examples of .jy scripts

Autoplot is able to load resources from GitHub, and one area has been set up to contain hundreds of example scripts. The GitHub site https://github.com/autoplot/dev/bugs/ contains scripts which demonstrate bugs (which have been fixed). https://github.com/autoplot/dev/demos/ and https://github.com/autoplot/dev/rfe/ contain small but complete scripts which demonstrate commands. Note these scripts typically have dates for names, but they are easily searched. (See the upper-left search box, as of January 2020.)

Miscellaneous notes

Note the completions use a trick to work, and that is to refactor the script to an equivalent script which can be executed immediately. This works by reducing the script to just imports, constructor calls, and a number of routines which are known to be quick. Of course this doesn't always work, but it provides pretty good results. However, because of this, some parts of the code cannot support completions, like callbacks from Java code (def boxSelected). It's also possible that a constructor call is very slow, which would hang completions, but in practice the assumption that constructors are quick has held up. Last, there may be some side effects as well, like GUIs being created. The completions are getting more attention than before as many more people are using the editor, and this code is maturing.
