categories
  • question
    Loading a data set

    You can load your data in several ways. The easiest is to drag your .csv file from your computer directly into the dashboard. The data are instantly sorted into the spreadsheet; click 'Done' to finish loading. You can also upload a file in the traditional manner by clicking the Load button in the left menu and selecting a .csv file. Finally, you can copy data from a spreadsheet and paste it into a blank scratch sheet, or load a spreadsheet from a Google Sheets URL.

  • question
    Adding data points

    Double-click anywhere in the graph to add a data point, and watch the numerical and graphical analysis instantly update! The old line of best fit is denoted by the dotted line, and the updated line of best fit is the solid one. The new data point will appear as a yellow dot in the place that you clicked, and Datasplash will show you the exact x and y values for it. Any new data points will remain yellow so that you can easily differentiate between those that you’ve added and ones that were in the original dataset.

  • question
    Editing data points

    To remove a data point, right-click on it and select 'Delete'. Your graphical and numerical analysis will update automatically! To work with a group of points, click and drag within the graph to highlight a subset of your data. When you release the mouse button, you’ll be prompted either to keep only the points you’ve selected or to delete them. As always, the graphical and numerical analysis will instantly update to reflect your choice! To give a data point a new label or to change its x or y value, double-click on it in Datasplash’s scatterplot.

  • question
    Graphing displays

    The mean, median, minimum, and maximum values of your data set can be displayed in the graph by selecting the Show Summary Statistics option in the Graph Settings box (gear icon), located below and to the right of the scatter plot.
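
    If you want to check the displayed summary statistics by hand, the sketch below computes the same four quantities with Python's standard library. The sample values are hypothetical, and this is not Datasplash's own code.

```python
import statistics

# Hypothetical sample values; replace with one column of your own data.
values = [2.0, 4.0, 4.0, 5.0, 10.0]

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "minimum": min(values),
    "maximum": max(values),
}
print(summary)
```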

  • question
    Calculating correlation coefficients

    Click on the Correlation tab, located above the graph. Mouse over any of the colored squares to see the values of the respective correlation coefficients.
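
    For reference, a correlation coefficient between two variables can be computed as in the sketch below. It assumes the Pearson coefficient; this page does not state which coefficient Datasplash reports, and the data and the function name pearson_r are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data for two numeric variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
print(round(pearson_r(x, y), 3))
```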

  • question
    Saving data

    To save your work, click on 'Save' in the left navigation. You can save the dataset to your account, download it, or save the graph as a .PNG or .SVG file.

  • question
    Printing data

    Click on the 'Print' icon in the left navigation, where you can decide whether you want to print the graph and the statistics from the tab you are currently on, or just the graph.

  • question
    Sharing data

    Click on the 'Share' icon in the left navigation. You will be prompted to share via email, by copying a Bit.ly-generated URL, or by grabbing a snippet of code to embed your graph.

  • question
    Running a t-test of equality of means

    Click on the 'Distributions' tab to test whether the average values in two subsamples are different. By default, Datasplash uses the first two variables in your dataset that are suitable for this test: a dichotomous variable, whose two values define the two subsamples, and a continuous variable. To change either variable, click on the variable name and select a new variable from the drop-down menu. Datasplash automatically and instantly recalculates all the test statistics. Hover over the p-value to see an interpretation of the test result. The graph within the 'Distributions' tab shows the estimated distributions for the two subsamples. The two vertical lines represent the means of the two subsamples. Click on the arrows to cycle through the variables in your data set.
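
    The idea behind the test can be sketched as follows: split the continuous variable by the two values of the dichotomous variable, then compare the two means relative to their standard errors. The sketch assumes Welch's unequal-variance version of the t-test; this page does not say which variant Datasplash uses, and the rows and the function name welch_t are hypothetical. Turning the statistic into a p-value additionally requires the t distribution, which is not in Python's standard library.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    na, nb = len(sample_a), len(sample_b)
    va = statistics.variance(sample_a) / na  # squared standard error of mean A
    vb = statistics.variance(sample_b) / nb  # squared standard error of mean B
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical rows: (dichotomous variable, continuous variable).
rows = [("yes", 5.0), ("yes", 7.0), ("yes", 9.0),
        ("no", 1.0), ("no", 3.0), ("no", 5.0)]
group_yes = [value for label, value in rows if label == "yes"]
group_no = [value for label, value in rows if label == "no"]

t_stat, dof = welch_t(group_yes, group_no)
print(round(t_stat, 4), round(dof, 2))
```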

  • question
    Running a chi-square test of independence

    Click on the 'CrossTab' tab to test whether the proportions differ by subsample. By default, Datasplash uses the first two categorical variables in your dataset. To change either variable, click on the variable name and select a new variable from the drop-down menu. Datasplash automatically and instantly recalculates all the test statistics. Hover over the p-value to see an interpretation of the test result. The graph within the 'CrossTab' tab shows a bar chart. Click on the arrows next to the variable names to cycle through the categorical variables in your data set.
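
    As a rough illustration of what the test computes, the sketch below derives the chi-square statistic and its degrees of freedom from a cross-tabulation of counts. The table and the function name chi_square are hypothetical, and the p-value (which needs the chi-square distribution) is left out.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical 2x2 cross-tabulation: rows and columns are categories.
table = [[10, 20],
         [20, 10]]
stat, dof = chi_square(table)
print(round(stat, 4), dof)
```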

  • question
    Estimating a linear regression model

    Click on the Equation tab, located above the graph, to view the linear regression model for your current data set.
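
    For a single explanatory variable, the line of best fit can be reproduced with ordinary least squares, as in this minimal sketch. The data and the function name fit_line are hypothetical; the page does not describe how Datasplash itself performs the estimation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data set.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
intercept, slope = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```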

  • question
    Changing the dependent variable in my model

    To change your dependent variable, hover over the y-variable shown on the vertical axis and click on the up or down arrows. The numerical results shown in the Equation, Regression Results, and Prediction tabs adjust immediately. Alternatively, click on the 'Change Model' button at the bottom. A modal window will appear in which you can choose your dependent and independent variables by clicking and dragging. You can also add further explanatory variables to your model: click 'Move All' to move all additional explanatory variables into your model, or click and drag them individually. Only the first explanatory variable on the list will be displayed in the graph.

  • question
    Changing explanatory variables in my model

    You can add explanatory variables to your model by clicking on the dependent variable and then clicking and dragging. Another option is the 'Add Explanatory Variable' button, found in the lower left-hand corner of the screen. Any variables you add will change the numerical analysis: the bolded values represent the updated model, and the non-bolded values displayed below them represent the old model. Click the same button to remove any additional variables. Once you’ve added explanatory variables to your model, you can cycle through them in your graph. Hover your mouse over either the explanatory or response variable currently displayed on the x- or y-axis, respectively. Arrows will appear that allow you to change the variables displayed in the scatter plot.

  • question
    Including categorical variables in my model

    Datasplash recognizes categorical variables as long as they contain no more than 5 distinct values. Simply upload your data set and watch Datasplash instantly display it graphically and run a regression that includes your categorical data! The regression is run with one value of the categorical variable left out. This avoids perfect collinearity, since all models include a constant term by default. The estimated effects of the included categories are therefore measured relative to the omitted category. Datasplash lets you change the category your model omits: simply click and drag the (omitted category) label to any of the other categories shown on the x-axis. The model’s numerical results will update automatically.
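
    Leaving out one category corresponds to the usual dummy-variable coding, sketched below: every category except the omitted one gets its own 0/1 column. The variable, its values, and the function name dummy_code are hypothetical and only illustrate the idea.

```python
def dummy_code(values, omitted):
    """Build one 0/1 column per category, skipping the omitted reference category."""
    categories = sorted(set(values) - {omitted})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical categorical variable with three distinct values.
region = ["north", "south", "west", "south", "north"]
columns = dummy_code(region, omitted="north")
print(columns)
```

    If a column for the omitted category were included as well, the dummy columns would add up to 1 in every row, which duplicates the constant term. That is the perfect collinearity the omission avoids.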

  • question
    Checking for statistically significant coefficients

    Take a look at the p-values for each coefficient, located in the 'Equation' tab of Datasplash. Mouse over the values to see an intuitive interpretation! Additionally, 95% confidence intervals are displayed for all the explanatory variables you include in your model. Mouse over them, and a text box will appear explaining the level of statistical significance for each coefficient.

  • question
    Finding the R-squared of my model

    The R-squared value of your model can be found in the 'Equation' tab of Datasplash.
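
    As a reminder of what this number measures, R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses hypothetical observed values and predictions from a fitted line; the function name r_squared is an illustration, not part of Datasplash.

```python
def r_squared(observed, predicted):
    """Share of the variation in the observed values explained by the model."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical observed values and the predictions of the line y = 2.2 + 0.6 * x.
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.2 + 0.6 * x for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
print(round(r_squared(observed, predicted), 3))
```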

  • question
    Finding the value and p-value of the F-test

    The value of your F-test and the p-value of that test can be found on the Equation tab, below the R-squared value.
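
    For a model with a constant term, the overall F statistic can be recovered from R-squared, the number of observations, and the number of explanatory variables. The sketch below assumes this standard overall-significance test; the inputs and the function name f_statistic are hypothetical, and the p-value (which needs the F distribution) is omitted.

```python
def f_statistic(r2, n_obs, n_explanatory):
    """Overall F statistic of a linear model that includes a constant term."""
    explained = r2 / n_explanatory
    unexplained = (1 - r2) / (n_obs - n_explanatory - 1)
    return explained / unexplained

# Hypothetical model: R-squared of 0.6, 5 observations, 1 explanatory variable.
f_value = f_statistic(0.6, n_obs=5, n_explanatory=1)
print(round(f_value, 3))
```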

  • question
    Saving the residuals and predicted values of my model

    To save the residuals and predicted values of your model, hover the mouse over 'Save' and click 'Download Dataset'. Datasplash will download a .csv (comma-separated values) file of your data with two new columns, labeled Residual and Prediction.
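
    The two extra columns can be understood with the sketch below, which adds a prediction and a residual (observed minus predicted) to each row and writes the result as CSV text. The rows, the coefficients, and the column order are assumptions for illustration; the layout of the file Datasplash actually produces may differ.

```python
import csv
import io

# Hypothetical data rows and hypothetical fitted coefficients.
rows = [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.0}, {"x": 3.0, "y": 5.0}]
intercept, slope = 2.2, 0.6

for row in rows:
    row["Prediction"] = intercept + slope * row["x"]
    row["Residual"] = row["y"] - row["Prediction"]  # observed minus predicted

# Write to an in-memory buffer; use open("file.csv", "w", newline="") for a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["x", "y", "Residual", "Prediction"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```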

  • question
    Running out-of-sample predictions

    Click on the 'Prediction' tab. By default, Datasplash pre-populates the text boxes with the mean values of the explanatory variables, but you can enter any value for each explanatory variable. Datasplash will show you the predicted value of the response variable.
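
    A prediction from a linear model is the intercept plus each coefficient multiplied by the value entered for its variable. The sketch below starts from the column means, as the default text boxes do, and then tries custom values. The variable names, coefficients, and data are hypothetical.

```python
def predict(intercept, coefficients, inputs):
    """Predicted response: intercept plus the sum of coefficient * input value."""
    return intercept + sum(coefficients[name] * inputs[name] for name in coefficients)

# Hypothetical fitted model and data columns.
intercept = 10.0
coefficients = {"hours": 0.5, "age": -0.25}
data = {"hours": [2.0, 4.0, 6.0], "age": [20.0, 30.0, 40.0]}

# Default inputs: the mean of each explanatory variable.
defaults = {name: sum(column) / len(column) for name, column in data.items()}
at_means = predict(intercept, coefficients, defaults)

# Any other values can be entered instead of the means.
custom = predict(intercept, coefficients, {"hours": 10.0, "age": 20.0})
print(at_means, custom)
```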