{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Multifunctional Graph Plotter\n",
"### Ewan Miles - 06/05/2020\n",
"\n",
"**This code is entirely open-source and thus editable by any user.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This program is designed to automatically plot a graph by unpacking data from a table in the same file location (i.e. if this program is under _C:/Desktop_, the datatable must be under _C:/Desktop_ also). It requires certain inputs from the user, including\n",
"* the **filename**;\n",
"* the **formatting** of each series, with a name for each series in the **legend**;\n",
"* the **axis scale type** for $x$ and $y$;\n",
"* the **titles** for each axis and the graph.\n",
"\n",
"The program will also give you the option to save the figure to the file location if it displays the data as desired. Note that the tables should be in **_.csv_** format, as this allows for data unpacking. Furthermore, make\n",
"sure the data is in either float (**decimal point**) or integer format.\n",
"\n",
"Please arrange the data so that columns follow as such: $x\\space data, x\\space uncertainty, y\\space data, y\\space uncertainty$. Four columns as such constitutes a **_series_**. Place series next to each other in the adjacent four columns. Below is an example template of how to format your tables:\n",
"\n",
"$$\\mathbf{Series 1\\kern 25em Series 2}$$\n",
"$$x\\space data \\kern 3em x\\space uncertainty \\kern 3em y\\space data \\kern 3em y\\space uncertainty \\kern 2em | \\kern 2em x\\space data \\kern 3em x\\space uncertainty \\kern 3em y\\space data \\kern 3em y\\space uncertainty$$\n",
"\n",
"If you do not wish to include horizontal error bars, for the purpose of plotting using this program, set your $x \\space uncertainties$ equal to 0. Set the $y \\space uncertainties$ equal to 0 if you do not wish to include vertical error bars. If the data has no uncertainty, leave both as 0 and learn to use a proper experimental method!\n",
"\n",
"Before the program proceeds with plotting the graph, it will print arrays of the unpacked tables to test whether it is the data you desire. This will happen in a separate cell labelled **Print data to check**; if the data is correct you can run the following cells to plot the graph. It _will not print all of the data_ if it is in more than 5 series in order to save space in the console and keep the code running smoothly; it will instead print the first two and last two series."
]
},
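{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a concrete illustration of the expected layout, the hypothetical snippet below builds a two-series table in the four-column-per-series format and unpacks it with `numpy` (`io.StringIO` stands in for the csv file here, and all values are invented):\n",
"\n",
"```python\n",
"import io\n",
"import numpy as np\n",
"\n",
"#Hypothetical two-series table: each series is x, x uncertainty, y, y uncertainty\n",
"table = io.StringIO(\n",
"    \"x,dx,y,dy,x,dx,y,dy\\n\"          #Header row (skipped when unpacking)\n",
"    \"1.0,0.1,2.0,0.2,1.0,0.1,2.1,0.2\\n\"\n",
"    \"2.0,0.1,3.9,0.2,2.0,0.1,4.2,0.2\\n\"\n",
")\n",
"\n",
"#unpack=True returns one array per column, exactly as this notebook expects\n",
"var = np.loadtxt(table, delimiter=\",\", skiprows=1, unpack=True)\n",
"print(len(var))  #8 columns = 2 series of 4 columns each\n",
"```"
]
},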
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"### IMPORTING MODULES, DEFINING FUNCTIONS ###\n",
"\n",
"%matplotlib notebook\n",
"#Allowing interactive plots\n",
"\n",
"import numpy as np #Maths module\n",
"import matplotlib.pyplot as plt #Plots graphs\n",
"import ipywidgets as wdg #Interactive sliders, radio buttons, etc\n",
"import scipy.stats as stats #Gaussian fits, etc \n",
"\n",
"def dataprint(var,start,end):\n",
" \"\"\"\n",
" Function which iterates through datasets to print out series of data (used to check against data), inputs:\n",
" - var: Unpacked dataset (e.g. [5,7,5,3,2],[5,7,89,8,87])\n",
" - start: Startpoint for iterating through data (e.g. first series number)\n",
" - end: Endpoint for iterating through data (e.g. last series number)\n",
" \"\"\"\n",
" for n in range(start,end):\n",
" print(\">>> Series {0}:\".format(n+1))\n",
" for j in range(4*n,4*(n+1)):\n",
" print(var[j])\n",
" \n",
"def fullset(var):\n",
" \"\"\"\n",
" Function which iterates through data series to create one full dataset including all datapoints\n",
" Used for plotting fitted curves/lines, does not affect original data array/matrix, input:\n",
" - var: Dataset already unpacked into array/matrix (e.g.[5,6,7,8],[0.1,0.1,0.1,0.2],...) \n",
" \"\"\"\n",
" #Construct empty datasets to fill over iteration\n",
" xdata = []\n",
" ydata = []\n",
" xerr = []\n",
" yerr = []\n",
" k = 0 #Iteration variable\n",
" while k < len(var):\n",
" xdata.append(var[k]) #Append x dataset\n",
" k += 1\n",
" xerr.append(var[k]) #Append x-error dataset\n",
" k += 1\n",
" ydata.append(var[k]) #Append y dataset\n",
" k += 1\n",
" yerr.append(var[k]) #Append y-error dataset\n",
" k += 1\n",
" return xdata, ydata, xerr, yerr\n",
"\n",
"def residuals(degree, p, dy):\n",
" \"\"\"\n",
" Function that calculates the residuals, squared residuals and sq residual sum for a polynomial\n",
" fit to data, which can be used to calculate the reduced chi^2 value, inputs:\n",
" - degree: Order of fitted polynomial (e.g. quadratic degree = 2)\n",
" - p: Coefficients of fitted polynomial found using np.polyfit\n",
" - dy: Arrays of y-uncertainty data ONLY from the full dataset\n",
" NOTE: It will attempt to unpack the datafile as defined at the top of this notebook, check those\n",
" variables (e.g. data, line, seriesno, etc.) have not been redefined as something else\n",
" Outputs numpy array of [residuals, square residuals, sq residual sum]\n",
" \"\"\"\n",
" #Gather separate arrays of all xpoints and ypoints\n",
" xunpack = np.loadtxt(data, delimiter=\",\", skiprows=line, usecols=(list(i for i in range(0,4*seriesno,4))), unpack=True, encoding=\"UTF-8\")\n",
" yunpack = np.loadtxt(data, delimiter=\",\", skiprows=line, usecols=(list(i for i in range((0*seriesno+2),4*seriesno,4))), unpack=True, encoding=\"UTF-8\")\n",
"\n",
" ### THIS SECTION OF CODE COVERS THE POSSIBILITY THAT THE x VALUES IN EACH COLUMN MAY NOT BE THE SAME\n",
" #Gather distinct values of x from array (columns may not be the same)\n",
" xtrimmed = np.unique(xunpack)\n",
"\n",
" #Empty array for mean y points\n",
" meanypts = np.array([])\n",
"\n",
" for i in xtrimmed:\n",
" xpos = np.where(xunpack == i) #Find occurrences of distinct x in full x points array\n",
" ybar = np.mean(yunpack[xpos]) #Mean the corresponding y points in y array\n",
" meanypts = np.append(meanypts, ybar)\n",
"\n",
" powers = np.linspace(degree,0,degree+1)\n",
"\n",
" ylinepts = []\n",
" for i in xtrimmed:\n",
" term = 0\n",
" for j,n in zip(powers,p):\n",
" term += n*(i**j)\n",
" ylinepts.append(term)\n",
"\n",
" residuals = []\n",
" for y,ybar,unc in zip(ylinepts,meanypts,dy):\n",
" residuals.append((y-ybar)/unc)\n",
" rsq = np.square(residuals)\n",
" rsum = np.sum(rsq)\n",
" \n",
" return np.array([residuals,rsq,rsum])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"### SEARCH FOR FILE BASED ON USER INPUT ###\n",
"\n",
"#Decode csv for and define variable to unpack later\n",
"encoded = True #Loop for checking if filename is correct\n",
"while encoded == True:\n",
" csv = str(input(\"What is the name of the data file? It must be a .csv file. Do not include the .csv extension, or 'quotes': \"))\n",
" data = \"{0}.csv\".format(csv)\n",
" try:\n",
" datatable = open(data, encoding=\"UTF-8-sig\") #Try opening the datafile as variable\n",
" encoded = False #If successful break loop\n",
" except FileNotFoundError: #If unsuccessful ask user to retype file name\n",
" print(\"Sorry, the file could not be found. Check your input.\")\n",
" continue\n",
"\n",
"firstline = datatable.readlines(1) #Unpack first line of data\n",
"\n",
"#Loop to count each instance of comma delimiter, add 1 series to count for each 4 instances of delimiter\n",
"for i in firstline:\n",
" count = 1\n",
" while len(i) != 0:\n",
" loc = i.find(\",\") #Cycle through and count each instance of delimiter\n",
" if loc == -1:\n",
" break #Break from loop if delimiter not present\n",
" count += 1\n",
" i = i[loc+1:] #Slice previous delimiter from string, re-iterate\n",
" continue\n",
" seriesno = int(count/4)\n",
"\n",
"#Loop to find first row with floats, which is starting row of data\n",
"line = 1\n",
"for i in datatable.readlines():\n",
" comma = i.find(\",\")\n",
" i = i[:comma] #Slice to first item in line\n",
" try:\n",
" float(i) #Attempt to make float, if not, row is not data\n",
" break\n",
" except ValueError:\n",
" line += 1\n",
"\n",
"datatable = open(data, encoding=\"UTF-8-sig\") #Open the datafile again as reading lines clears variable\n",
"\n",
"#Unpacking data and creating variables for plotting the graph\n",
"var = [i for i in range(4*seriesno)]\n",
"var[:] = np.loadtxt(datatable, delimiter=\",\", skiprows = line, unpack=True, encoding=\"UTF-8\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Print data to check\n",
"\n",
"The cell below will give you the option to print the unpacked data to check whether the operation has worked successfully. It is not required that you run the cell, it is entirely optional. Reasons for not checking include very large datasets, or no way of checking against the csv file.\n",
"\n",
"As printing 200 series of data is a waste of time and space, the cell will print four series only. If the dataset is four series or less, it will print all series, but if it is five or more, it will print **the first two series and the last two series only**. It is useful to check against smaller datasets, to make sure no points have been missed by the functions above.\n",
"\n",
"If your data is not being printed out correctly, check that the csv file fits all specifications for the unpacking in the text cell at the top."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Print dataset for user to check it is correct\n",
"if seriesno < 5:\n",
" dataprint(var,0,seriesno) #Only print full set if 4 series or fewer\n",
"\n",
"else:\n",
" dataprint(var,0,2)\n",
" print(\"\\n.......\\n\")\n",
" dataprint(var,seriesno-2,seriesno) #If more than 4 series, print first two and last two series\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Title, x axis and y axis\n",
"xlabel = str(input(\"What would you like to label the x axis?: \"))\n",
"ylabel = str(input(\"What would you like to label the y axis?: \"))\n",
"title = str(input(\"What would you like to title the graph?: \"))\n",
"\n",
"#Radio button widget choices for gridlines\n",
"gridradio = wdg.RadioButtons(\n",
" options=[\"Yes\",\"No\"], #Options for gridlines\n",
" description=\"Gridlines?\",\n",
" disabled=False\n",
")\n",
"\n",
"gridradio #Call the selection widget"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scale types\n",
"\n",
"For certain types of graphs, it can be advantageous to plot a logarithmic scale or other types of scale, which are available within `matplotlib`. These are presented in radio button choices below; they are declared as variables later on in the script for plotting the graph."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Scale types for each axis\n",
"print(\"There are multiple types of scale available for axes in matplotlib. These include log, linear, symlog and logit.\")\n",
"print(\"To learn more about scale types, reading is available at https://matplotlib.org/gallery/pyplots/pyplot_scales.html#sphx-glr-gallery-pyplots-pyplot-scales-py\")\n",
"\n",
"#Radio button widget choices for x axis\n",
"xaxisradio = wdg.RadioButtons(\n",
" options=[\"linear\",\"log\",\"symlog\",\"logit\"], #Options for scale type\n",
" value=\"linear\", #Default selected scale type\n",
" description=\"$x$ Axis Scale:\",\n",
" disabled=False\n",
")\n",
"\n",
"xaxisradio #Call the selection widget"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Radio button widget choices for y axis\n",
"xscale = xaxisradio.value\n",
"yaxisradio = wdg.RadioButtons(\n",
" options=[\"linear\",\"log\",\"symlog\",\"logit\"], #Options for scale type\n",
" value=\"linear\", #Default selected scale type\n",
" description=\"$y$ Axis Scale:\",\n",
" disabled=False\n",
")\n",
"\n",
"yaxisradio #Call the selection widget"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Graph Markers\n",
"\n",
"Although purely stylistic, `matplotlib` offer different shapes with which you can cast your data. There is a link in the code below to a website which shows most of the markers `matplotlib` have to offer. The string input below is the way to choose the marker. If you want the data to be joined point to point, add \"-\" after the shape selection in the string (e.g. joined up with stars would be \"*-\")"
]
},
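{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, this minimal sketch (with invented data) plots stars joined by lines; only the format string passed to `fmt` matters here:\n",
"\n",
"```python\n",
"import matplotlib.pyplot as plt\n",
"\n",
"#Invented data purely to demonstrate marker/format strings\n",
"x = [1, 2, 3]\n",
"y = [2, 4, 6]\n",
"dy = [0.2, 0.2, 0.3]\n",
"\n",
"plt.figure()\n",
"eb = plt.errorbar(x, y, yerr=dy, fmt=\"*-\", label=\"stars joined by lines\")  #\"*\" alone plots unconnected stars\n",
"plt.legend(loc=\"best\");\n",
"```"
]
},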
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Defining variables for the scale types\n",
"xscale = xaxisradio.value\n",
"yscale = yaxisradio.value\n",
"\n",
"#User input for dataseries shapes on plot\n",
"print(\"In which style would you like the data series to be cast? There are options you can find here: https://matplotlib.org/api/markers_api.html\")\n",
"marker = str(input(\"Enter your shape of choice, as a default the data will be connected by lines between datapoints: \"))\n",
"\n",
"#User input for labels for each series\n",
"labels = []\n",
"j = 0 #Iteration variable\n",
"while j < seriesno:\n",
" lbl = str(input(\"Input the label for Series {0} of your data: \".format(j+1))) #Name of series\n",
" labels.append(lbl)\n",
" j += 1\n",
"\n",
"#Plotting the data\n",
"plt.figure()\n",
"if gridradio.value == \"Yes\": #Check radio selection for gridlines, plot if wanted\n",
" plt.grid(True)\n",
"\n",
"#Loop through variable sets plotting each\n",
"index = 0\n",
"while index < seriesno:\n",
" plt.errorbar(var[4*index], var[2+(4*index)], xerr=var[1+(4*index)], yerr=var[3+(4*index)], label=labels[index], fmt=marker)\n",
" index += 1\n",
" \n",
"plt.xlabel(xlabel) #x axis label\n",
"plt.ylabel(ylabel) #y axis label\n",
"plt.xscale(xscale) #x scale type\n",
"plt.yscale(yscale) #y scale type\n",
"plt.title(title) #Title\n",
"plt.legend(loc=\"best\");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### SAVE CELL: The code below saves the figure with any name you give it.\n",
"\n",
"**Do not run the cell if you do not wish to save the figure - the only escape to the input field is to kill the kernel.**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Save figure to local directory with name chosen by user\n",
"name = str(input(\"Name the graph. End it with .pdf for a PDF, .png for a PNG, etc. The default is a PNG:\"))\n",
"plt.savefig(name)\n",
"print(\"It has been saved to the file location (local directory) of this python program.\") "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fitted straight line\n",
"\n",
"The code below will plot your data once again, but with a weighted straight line fit. This includes calculating a weighted gradient and $y$-intercept. If there are multiple series of data, expands the x and y datasets by combining each series' elements and uses all of the points to construct the line.\n",
"\n",
"The following equations are used in constructing a weighted line of best fit:\n",
"* The weight given to each point, $w_i$:\n",
"$$\\normalsize{w_i=\\frac{1}{(Δy_i)^2}}$$\n",
"* The equation for the weighted $m$:\n",
"$$\\normalsize{m=\\frac{Σw_iΣw_ix_iy_i-Σw_ix_iΣw_iy_i}{δ}}$$\n",
"* The equation for the weighted $c$:\n",
"$$\\normalsize{c=\\frac{Σw_ix_i^2Σw_iy_i-Σw_ix_iΣw_ix_iy_i}{δ}}$$\n",
"* The denominator, $\\delta$ in both equations for $m$ and $c$:\n",
"$$\\large{δ=Σw_iΣw_ix_i^2-(Σw_ix_i)^2}$$\n",
"\n",
"Their uncertainties, $\\Delta m$ and $\\Delta c$ are given by:\n",
"$$\\normalsize{Δm=\\sqrt{\\frac{Σw_i}{δ}}}$$\n",
"\n",
"$$\\normalsize{Δc=\\sqrt{\\frac{Σx_i^2w_i}{δ}}}$$"
]
},
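{
"cell_type": "markdown",
"metadata": {},
"source": [
"The equations above can be sketched in a few lines of `numpy`. The data here is invented: the points lie exactly on $y = 2x + 1$ with equal uncertainties, so the weighted fit should recover $m = 2$ and $c = 1$:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"#Invented data: y = 2x + 1 exactly, with equal uncertainties\n",
"xdata = np.array([1.0, 2.0, 3.0, 4.0])\n",
"ydata = 2*xdata + 1\n",
"yerr = np.full(4, 0.1)\n",
"\n",
"w = 1/np.square(yerr)  #Point weights w_i\n",
"delta = np.sum(w)*np.sum(w*np.square(xdata)) - np.square(np.sum(w*xdata))\n",
"m = (np.sum(w)*np.sum(w*xdata*ydata) - np.sum(w*xdata)*np.sum(w*ydata))/delta\n",
"c = (np.sum(w*np.square(xdata))*np.sum(w*ydata) - np.sum(w*xdata)*np.sum(w*xdata*ydata))/delta\n",
"uncm = np.sqrt(np.sum(w)/delta)  #Uncertainty on the gradient\n",
"\n",
"print(m, c)  #2.0 1.0 (up to floating-point error)\n",
"```"
]
},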
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Construct dataset including all points if multiple dataseries, for line of best fit to cover all data\n",
"if seriesno != 1:\n",
" xdata, ydata, xerr, yerr = fullset(var)\n",
"else:\n",
" xdata = var[0]\n",
" xerr = var[1]\n",
" ydata = var[2]\n",
" yerr = var[3]\n",
" \n",
"#Point weights, δ denominator, slope and intercept in order, as in equations above\n",
"w = 1/(np.square(yerr)) \n",
"delta = (np.sum(w)*np.sum(w*np.square(xdata))) - np.square(np.sum(w*xdata))\n",
"m = ((np.sum(w)*np.sum(w*xdata*ydata))-(np.sum(w*xdata)*np.sum(w*ydata)))/delta\n",
"c = ((np.sum(w*np.square(xdata))*np.sum(w*ydata))-(np.sum(w*xdata)*np.sum(w*xdata*ydata)))/delta\n",
"\n",
"#Uncertainties also as above\n",
"uncm = np.sqrt(np.sum(w)/delta)\n",
"uncc = np.sqrt(np.sum(np.square(xdata)*w)/delta)\n",
"\n",
"#Create arrays for plotting the line\n",
"xpoints = np.linspace(np.min(xdata),np.max(xdata),200)\n",
"ypoints = (m*xpoints) + c\n",
"\n",
"#Print slope and intercept with uncertainty for user (unrounded)\n",
"print(\"Value of the gradient (m):\\n{0} ± {1} \".format(m,uncm))\n",
"print()\n",
"print(\"Value of the y-intercept (c):\\n{0} ± {1}\".format(c,uncc))\n",
"\n",
"#Plotting the data\n",
"plt.figure()\n",
"if gridradio.value == \"Yes\": #Check radio selection for gridlines, plot if wanted\n",
" plt.grid(True)\n",
"\n",
"#Loop through variable sets plotting each\n",
"index = 0\n",
"while index < seriesno:\n",
" plt.errorbar(var[4*index], var[2+(4*index)], xerr=var[1+(4*index)], yerr=var[3+(4*index)], label=labels[index], fmt=marker)\n",
" index += 1\n",
" \n",
"plt.plot(xpoints,ypoints,\"k-\",label=\"Weighted Line Fit\") #Line of best fit\n",
"plt.xlabel(xlabel) #x axis label\n",
"plt.ylabel(ylabel) #y axis label\n",
"plt.xscale(xscale) #x scale type\n",
"plt.yscale(yscale) #y scale type\n",
"plt.title(title) #Title\n",
"plt.legend(loc=\"best\");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### SAVE CELL"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Save figure to local directory with name chosen by user\n",
"name = str(input(\"Name the graph. End it with .pdf for a PDF, .png for a PNG, etc. The default is a PNG:\"))\n",
"plt.savefig(name)\n",
"print(\"It has been saved to the file location (local directory) of this python program.\") "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fitted polynomial curve\n",
"\n",
"The code below will plot your data once again, but with a polynomial curve fit. The `numpy` module has this capability, and allows the user to fit a polynomial curve with a user-defined order using the function `np.polyfit`. It is able to determine appropriate coefficients from the data, along with a matrix of covariance. Again, if there are multiple dataseries, it will expand the dataset to include all series under one array of $x$, $x\\space uncertainty$, $y$ and $y\\space uncertainty$.\n",
"\n",
"### Reduced $\\chi^2$: Automatic or User-chosen?\n",
"\n",
"Choosing an appropriate order of polynomial to fit the data is important, and can be measured using something called the reduced $\\chi^2$. This should be around 1 for a good fit; **the lowest order polynomial** with $\\chi^2$ approaching 1 should be used, otherwise the curve is often said to be 'overfitted'. \n",
"\n",
"This program offers a pseudo-automatic route. It will automatically fit multiple polynomials to the data and calculate a reduced $\\chi^2$, running the user through 10 orders of the polynomial. \n",
"\n",
"It will output a graph mapping the reduced $\\chi^2$ after execution. From there, you can make the decision to use a longer or shorter polynomial based on the change in $\\chi^2$ as the order $n$ increases; if $\\Delta y$ is accurate, $\\chi^2$ should plateau around 1, but may not. In this case the user's best judgement is required.\n",
"\n",
"To calculate the reduced $\\chi^2$ value for a curve fit, the code will calculate the residuals of the datapoints from the curve, and from there use the following equations to reach a result:\n",
"\n",
"* The residual of any point, $d_i$:\n",
"$$\\normalsize d_i=y_{line}-y_i$$\n",
"\n",
"$$(for\\space all\\space points\\space y_i)$$\n",
"\n",
"* The degrees of freedom within the fit, $\\nu$, where $n_{coefficients}$ is the number of coefficients in the fitted polynomial:\n",
"\n",
"$$\\normalsize \\nu = n_{points}-n_{coefficients}$$\n",
"\n",
"* The reduced $\\chi^2$:\n",
"\n",
"$$\\normalsize{\\chi^2=\\frac{\\sum ({\\frac{d_i}{\\Delta y_i}})^2}{\\nu}}$$\n",
"\n",
"where $\\Delta y_i$ is the experimental y uncertainty."
]
},
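{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a standalone illustration of this recipe (invented data with known noise; the residuals are computed directly with `np.polyval` rather than through the `residuals` function defined above):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"#Invented data: a noisy quadratic whose scatter matches the stated uncertainty dy = 0.5\n",
"rng = np.random.default_rng(0)\n",
"x = np.linspace(0, 5, 30)\n",
"dy = np.full(30, 0.5)\n",
"y = 1.5*x**2 - 2*x + 3 + rng.normal(0, 0.5, 30)\n",
"\n",
"degree = 2\n",
"p = np.polyfit(x, y, degree)        #Fitted coefficients, highest power first\n",
"d = np.polyval(p, x) - y            #Residuals d_i = y_line - y_i\n",
"dof = len(x) - len(p)               #nu = n_points - n_coefficients\n",
"chi2 = np.sum(np.square(d/dy))/dof  #Reduced chi^2; close to 1 for a good fit\n",
"\n",
"print(chi2)\n",
"```\n",
"\n",
"Since the noise level matches $\\Delta y$ here, the reduced $\\chi^2$ comes out near 1."
]
},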
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"#Construct dataset including all points if multiple dataseries, for line of best fit to cover all data\n",
"if seriesno != 1:\n",
" xdata, ydata, dxdata, dydata = fullset(var)\n",
" #polyfit doesn't cooperate with array of np arrays, refilling lists of x, dx, y, dy\n",
" x = []\n",
" y = []\n",
" dx = []\n",
" dy = []\n",
" \n",
" #x variables\n",
" for i in xdata:\n",
" for j in np.nditer(i):\n",
" x.append(float(j))\n",
" \n",
" #y variables\n",
" for i in ydata:\n",
" for j in np.nditer(i):\n",
" y.append(float(j))\n",
" \n",
" #dx variables\n",
" for i in dxdata:\n",
" for j in np.nditer(i):\n",
" dx.append(float(j))\n",
" \n",
" #dy variables\n",
" for i in dydata:\n",
" for j in np.nditer(i):\n",
" dy.append(float(j))\n",
"else:\n",
" x = var[0]\n",
" dx = var[1]\n",
" y = var[2]\n",
" dy = var[3]\n",
"\n",
"order = np.arange(1,10) #Range of order values for looping\n",
"chiset = np.array([])\n",
"\n",
"for degree in order:\n",
" p, v = np.polyfit(x,y,degree,cov=True) #Polynomial coefficients, Matrix of covariance\n",
" resids, rsq, rsum = residuals(degree,p,dy)\n",
" npoints = len(x)\n",
" ncoeffs = len(p)\n",
" dof = npoints - ncoeffs #Degrees of freedom\n",
"\n",
" #Chi calculations, append each one to array for plotting\n",
" chi = rsum/dof\n",
" chiset = np.append(chiset, chi)\n",
"\n",
"plt.figure()\n",
"plt.plot(order,chiset,\"k-\")\n",
"plt.xlabel(\"Polynomial order $n$\")\n",
"plt.ylabel(\"$\\chi^2$ value\")\n",
"plt.title(\"Change in $\\chi^2$ fit value as order $n$ of\\nfitted polynomial increases\")\n",
"\n",
"#Return change in chi at each point\n",
"dchi = chiset[:-1] - chiset[1:]\n",
"print(\"Change in chi at each n: {0}\".format(dchi))\n",
"\n",
"#Suggest chi value based on change in chi\n",
"for c,n,d in zip(chiset,order,dchi):\n",
" if d < 1:\n",
" print(\"\\nSuggested polynomial fit order: {0}\".format(n))\n",
" print(\"Chi^2 value for this polynomial: {0:0.3f}\".format(c))\n",
" print(\"Choose your desired order below.\")\n",
" break\n",
" \n",
"#Allow user to input polynomial order for plot\n",
"user_order = wdg.BoundedIntText(\n",
" value=1,\n",
" min=1,\n",
" max=9,\n",
" step=1,\n",
" description=\"Order $n$:\",\n",
" disabled=False\n",
")\n",
"\n",
"user_order"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"#Calculate polynomial fit based on user-chosen order\n",
"degree = user_order.value\n",
"p, v = np.polyfit(x,y,degree,cov=True)\n",
"poly = np.poly1d(p) #line function for plotting fitted curve\n",
"\n",
"xpoints = np.linspace(x[0],x[len(x)-1],len(x)) #x value array\n",
"ypoints = poly(xpoints) #y value array using poly1d\n",
"\n",
"res, rsq, rsum = residuals(degree,p,dy)\n",
"\n",
"print(\"FOR POLYNOMIAL FIT OF ORDER: {0}\".format(degree))\n",
"print()\n",
" \n",
"#Printing coefficients with error\n",
"for j in range(np.size(p)):\n",
" print(\"The coefficient of order x^\", len(p)-j-1, \" is \", p[j], \" with error \", np.sqrt(np.diag(v))[j])\n",
" print()\n",
" \n",
"npoints = len(x)\n",
"ncoeffs = len(p)\n",
"dof = npoints - ncoeffs #Degrees of freedom\n",
"print(\"The degrees of freedom:\", dof)\n",
"\n",
"chi = rsum/dof\n",
"print(\"Reduced chi^2:\", chi)\n",
"print(\"\\n\\n\")\n",
"\n",
"#Plotting the data\n",
"plt.figure()\n",
"if gridradio.value == \"Yes\": #Check radio selection for gridlines, plot if wanted\n",
" plt.grid(True)\n",
"\n",
"#Loop through variable sets plotting each\n",
"index = 0\n",
"while index < seriesno:\n",
" plt.errorbar(var[4*index], var[2+(4*index)], xerr=var[1+(4*index)], yerr=var[3+(4*index)], label=labels[index], fmt=marker)\n",
" index += 1\n",
" \n",
"plt.plot(xpoints,ypoints,\"k-\",label=\"Polynomial fit, order {0}, $\\chi^2 =$ {1:0.3f}\".format(degree, chi)) #Line of best fit\n",
"plt.xlabel(xlabel) #x axis label\n",
"plt.ylabel(ylabel) #y axis label\n",
"plt.xscale(xscale) #x scale type\n",
"plt.yscale(yscale) #y scale type\n",
"plt.title(title) #Title\n",
"plt.legend(loc=\"best\");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### SAVE CELL"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Save figure to local directory with name chosen by user\n",
"name = str(input(\"Name the polynomial fit graph. End it with .pdf for a PDF, .png for a PNG, etc. The default is a PNG: \"))\n",
"plt.savefig(name)\n",
"print(\"It has been saved to the file location (local directory) of this python program.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The Residual Distribution\n",
"\n",
"Plotting the distribution of the residuals calculated above and fitting a Gaussian curve allows the user to assess the fit of the polynomial. The mean $x_0$ should be around 0, while the standard deviation $\\sigma$ should be approximately within the range of the uncertainty $\\Delta y_i$ from the data used."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gx = np.linspace(-np.max(np.abs(res)),np.max(np.abs(res)),50) #Range over which Gaussian is fitted\n",
"x0, sigma = stats.norm.fit(res) #Mean and stdev for Gaussian\n",
"gaussian = stats.norm.pdf(gx,x0,sigma) #Gaussian curve\n",
"\n",
"plt.figure()\n",
"# 15 bins, normalized:\n",
"plt.hist(res,bins=15,density=True,edgecolor='k') #Plot histogram\n",
"plt.plot(gx,gaussian,'r-', label=\"Gaussian Fit\") #Plot Gaussian fit\n",
"plt.title(\"Distribution of residuals from the polynomial fit\")\n",
"plt.xlabel(\"Residual Size (Corrected for random error)\")\n",
"plt.ylabel(\"Normalised Occurences\") \n",
"plt.legend(loc=\"best\");\n",
"\n",
"print(\"\\nMean residual value:\", x0, \"\\nStandard Deviation:\", sigma)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### SAVE CELL"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Save figure to local directory with name chosen by user\n",
"name = str(input(\"Name the graph. End it with .pdf for a PDF, .png for a PNG, etc. The default is a PNG: \"))\n",
"plt.savefig(name)\n",
"print(\"It has been saved to the file location (local directory) of this python program.\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}