Questions? You’ve come to the right place. Find answers here.

Loading a data set
You can load your data in several ways. The easiest is to drag a .csv file from your computer directly into the dashboard; the data are placed into the spreadsheet instantly. Click 'Done' to finish loading your data. You can also upload a .csv file in the traditional manner by clicking the Load button in the left menu and selecting the file. Alternatively, copy your data from a spreadsheet and paste it into a blank scratch sheet, or load a spreadsheet from a Google Sheets URL.

Adding data points
Double-click anywhere in the graph to add a data point, and watch the numerical and graphical analysis instantly update! The old line of best fit is denoted by the dotted line, and the updated line of best fit is the solid one. The new data point will appear as a yellow dot in the place that you clicked, and Datasplash will show you the exact x and y values for it. Any new data points will remain yellow so that you can easily differentiate between those that you’ve added and ones that were in the original dataset.

Editing data points
To remove a data point, simply right-click on it and select 'Delete'. Again, your graphical and numerical analysis will update automatically! Click and drag within the graph to highlight a subset of your data. Once you release your mouse, you’ll be prompted to either keep only the points that you’ve selected or delete the points that you’ve selected. As always, the graphical and numerical analysis will instantly update to reflect your choice! Double-click on any data point displayed in Datasplash’s scatterplot to give it a new label or to change the x or y values of that data point.

Graphing displays
The mean, median, minimum, and maximum values of your data set can be displayed in the graph by selecting Show Summary Statistics in the Graph Settings box (gear icon), located below and to the right of the scatter plot.

Calculating correlation coefficients
Click on the Correlation tab, located above the graph. Mouse over any of the colored squares to see the values of the respective correlation coefficients.
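If you want to check one of the displayed values by hand, here is a minimal sketch of how a correlation coefficient for each pair of variables can be computed. It assumes the Pearson coefficient, which this page does not confirm, and the column names and numbers are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical columns, not taken from any real dataset.
columns = {
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0],
    "score": [2.0, 4.0, 6.0, 8.0, 10.0],
    "errors": [5.0, 4.0, 3.0, 2.0, 1.0],
}

# One coefficient per pair of variables, like one colored square per pair.
matrix = {
    (a, b): pearson_r(columns[a], columns[b])
    for a in columns
    for b in columns
}
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.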

Saving data
To save your work, click on 'Save' in the left navigation. You can save the dataset to your account, download it, or save the graph as a .PNG or .SVG file.

Printing data
Click on the 'Print' icon in the left navigation, where you can decide whether you want to print the graph and the statistics from the tab you are currently on, or just the graph.

Sharing data
By clicking on the 'Share' icon in the left navigation, you will be prompted to share via email, by copying a Bit.ly-generated URL, or by grabbing a snippet of code to embed your graph.

Running a t-test of equality of means
Click on the 'Distributions' tab to test if the average values in two subsamples are different. By default, Datasplash uses the first two variables in your dataset that are suitable for this test: a dichotomous variable whose two values define the two subsamples, and a continuous variable. To change either variable, click on the variable name and select a new variable from the drop-down menu. Datasplash automatically and instantly recalculates all the test statistics. Hover over the p-value to see an interpretation of the test result. The graph within the 'Distributions' tab shows the estimated distributions for the two subsamples. The two vertical lines represent the means of the two subsamples. Click on the arrows to cycle through the variables in your data set.
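For readers who want to see the arithmetic behind a test of equal means, the sketch below splits a continuous variable by a two-valued variable and computes a t statistic. It assumes Welch's unequal-variance version of the test; this page does not say which variant Datasplash uses, and the data are hypothetical. The p-value would then be read from a t distribution with the returned degrees of freedom.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 denominator), each divided by its sample size.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical rows: (dichotomous group label, continuous value).
rows = [("A", 5.0), ("A", 6.0), ("A", 7.0), ("B", 8.0), ("B", 9.0), ("B", 10.0)]
group_a = [value for group, value in rows if group == "A"]
group_b = [value for group, value in rows if group == "B"]

t_stat, df = welch_t(group_a, group_b)
```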

Running a chi-square test of independence
Click on the 'CrossTab' tab to test if the proportions differ by subsample. By default, Datasplash uses the first two categorical variables in your dataset. To change either variable, click on the variable name and select a new variable from the drop-down menu. Datasplash automatically and instantly recalculates all the test statistics. Hover over the p-value to see an interpretation of the test result. The graph within the 'CrossTab' tab shows a bar chart. Click on the arrows next to the variable names to cycle through the categorical variables in your data set.
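As a rough illustration of what a chi-square test of independence measures, the sketch below builds a cross-tabulation from two categorical columns and computes Pearson's chi-square statistic. The category names and counts are hypothetical, and whether Datasplash applies any correction to the statistic is not stated on this page. The p-value would come from a chi-square distribution with the returned degrees of freedom.

```python
from collections import Counter

def chi_square(pairs):
    """Pearson chi-square statistic and degrees of freedom for paired categories."""
    observed = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    n = len(pairs)
    stat = 0.0
    for row in row_totals:
        for col in col_totals:
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[row] * col_totals[col] / n
            stat += (observed[(row, col)] - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical observations: (group, outcome) for 80 rows.
pairs = (
    [("treated", "yes")] * 30 + [("treated", "no")] * 10
    + [("control", "yes")] * 20 + [("control", "no")] * 20
)

stat, df = chi_square(pairs)
```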

Estimating a linear regression model
Click on the Equation tab, located above the graph, to view the linear regression model for your current data set.
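To make the equation concrete, here is a minimal sketch of an ordinary least squares fit with one explanatory variable. It assumes the line of best fit is a least squares fit, and the data and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = fit_line(xs, ys)
equation = f"y = {intercept:.2f} + {slope:.2f} * x"
```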

Changing the dependent variable in my model
To change your dependent variable, hover over the y-variable shown on the vertical axis and click on the up or down arrows. The numerical results shown in the Equation, Regression Results, and Prediction tabs adjust immediately. You can also click on the 'Change Model' button at the bottom. A modal window will appear that allows you to choose your dependent and independent variables by clicking and dragging. You will also have the option of adding further explanatory variables to your model: click 'Move All' to add all of them at once, or click and drag them individually. Only the first explanatory variable on the list will be displayed in the graph.

Changing explanatory variables in my model
You can add explanatory variables to your model by clicking on the dependent variable and then clicking and dragging. Another option is the 'Add Explanatory Variable' button, found in the lower left-hand corner of the screen. Any variables you add will change the numerical analysis. The bolded values represent the updated model, and the non-bolded values displayed below represent the old model. Click the same button to remove any additional variables. Once you’ve added explanatory variables to your model, you can cycle through them in your graph. Hover your mouse over either the explanatory or response variable currently displayed on the x- or y-axis, respectively. Arrows will appear that allow you to change the variables displayed in the scatter plot.

Including categorical variables in my model
Datasplash recognizes categorical variables, as long as they contain no more than 5 distinct values. Simply upload your data set and watch Datasplash instantly display it graphically and run a regression that includes your categorical data! The regression is run with one value of the categorical variable left out. This is done to avoid perfect collinearity (since all models include a constant term by default). Thus, the effect estimates for the included values of the categorical variable are measured relative to the omitted category. Datasplash lets you change the category your model omits. Simply click and drag (omitted category) to any of the other categories shown on the x-axis. The model’s numerical results will update automatically.
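The idea of leaving out one category can be sketched as follows: each remaining category becomes a 0/1 indicator column, and the omitted category is the baseline that every coefficient is compared against. The function name `dummy_code` and the example values are hypothetical and only illustrate the general technique.

```python
def dummy_code(values, omitted):
    """Build 0/1 indicator columns for every category except the omitted one."""
    categories = sorted(set(values))
    if omitted not in categories:
        raise ValueError(f"{omitted!r} is not a category in the data")
    kept = [c for c in categories if c != omitted]
    return {c: [1 if v == c else 0 for v in values] for c in kept}

# Hypothetical categorical variable with three distinct values.
regions = ["north", "south", "west", "south", "north"]

# "north" is the reference category, so it gets no column of its own.
dummies = dummy_code(regions, omitted="north")
```

Including a column for every category alongside the constant term would make the columns sum to the constant, which is the perfect collinearity the omission avoids.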

Checking for statistically significant coefficients
Take a look at the p-values for each coefficient, located in the 'Equation' tab of Datasplash. Mouse over the values to see an intuitive interpretation! Additionally, 95% confidence intervals are displayed for all the explanatory variables you include in your model. Mouse over them, and a text box will appear explaining the level of statistical significance for each coefficient.

Finding the R-squared of my model
The R-squared value of your model can be found in the 'Equation' tab of Datasplash.
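As a reminder of what this number means, R-squared is the share of the variation in the response variable that the model explains. The sketch below computes it for a one-variable least squares fit as 1 minus the ratio of the residual sum of squares to the total sum of squares. The data are hypothetical.

```python
def r_squared(xs, ys):
    """R-squared of a one-variable least squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of the response around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

r2 = r_squared(xs, ys)
```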

Finding the value and p-value of the F-test
The value of your F-test and the p-value of that test can be found on the Equation tab, below the R-squared value.

Saving the residuals and predicted values of my model
To save the residuals and predicted values of your model, hover the mouse over 'Save', and click 'Download Dataset'. Datasplash will download a .csv (comma-separated values) file of your data, with two new columns labeled Residual and Prediction.
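To show how the two new columns relate to your data, here is a small sketch: the prediction is the model's fitted value for each row, and the residual is the observed value minus that prediction. The coefficients, the data, and the column names other than Residual and Prediction are hypothetical, and the exact layout of the downloaded file may differ.

```python
import csv
import io

# Hypothetical fitted coefficients and data, for illustration only.
intercept, slope = 1.0, 2.0
rows = [{"x": 1.0, "y": 3.5}, {"x": 2.0, "y": 4.5}, {"x": 3.0, "y": 7.0}]

for row in rows:
    row["Prediction"] = intercept + slope * row["x"]
    # Residual = observed value minus predicted value.
    row["Residual"] = row["y"] - row["Prediction"]

# Write the rows as CSV text, with the two new columns after the original ones.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["x", "y", "Residual", "Prediction"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```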

Running out-of-sample predictions
Click on the 'Prediction' tab. By default, Datasplash pre-populates the text boxes with the mean values of the explanatory variables, but you can enter any value for your explanatory variables. Datasplash will show you the predicted value of the response variable.
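The calculation behind a prediction can be sketched as the intercept plus each coefficient multiplied by the value you enter. The coefficients, variable names, and sample values below are hypothetical; the default inputs are set to the sample means, as described above.

```python
# Hypothetical coefficients from an already-fitted linear model.
intercept = 2.0
coefficients = {"age": 0.5, "income": 0.25}

def predict(values):
    """Predicted response for one set of explanatory variable values."""
    return intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )

# Hypothetical sample, used only to compute the default (mean) inputs.
sample = {"age": [20.0, 30.0, 40.0], "income": [8.0, 12.0, 16.0]}
defaults = {name: sum(column) / len(column) for name, column in sample.items()}

at_means = predict(defaults)
# Values outside the range of the sample give an out-of-sample prediction.
out_of_sample = predict({"age": 50.0, "income": 20.0})
```

Predictions far outside the range of the original data rely on the linear relationship continuing to hold there, so treat them with more caution.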