Introduction
Step by step procedure :
Step 1 : Data parsing. The first step involves importing the required information from the data files (.txt). In practice, we need to extract the numeric data from the first (Y) and second (U) columns for the regression problem. This can be done in three different ways : 1. Copy and paste the data into your python code (as shown in the previous code block) -this solution is not recommended for large datasets-. 2. Remove the header from each file, leaving only numeric data in it, then import the data as a NumPy array - this solution is ok for large datasets but it is always a bad idea to manually modify data files, because you could accidentally end up modifying the data-. 3. Write a function to parse each file (since they all share the same data format) automatically. This last solution requires some advanced programming.
Step 2 : Preprocessing. The data contained in the data files are written in terms of y and not r. by inspecting your data you can see that the velocity is max at y close to R=0.45 and almost zero where y is close to 0. Therefore we need to apply a transformation to the original formula to find u(y) by substituting y = R-r into the equation for u(r). With this new formula we should be able to fit (that is, find the best values) for n and Umax using the experimental data from one data set. However...
Step 3 : Data preprocessing. ...The linear regression techniques do not apply here, because the formula of the velocity is a power law !!! By applying a proper mathematical transformation to the data you can make this nonlinear function linear again.
Step 4 : Linear Regression. Once you figure this out, you should be able to apply linear regression to a "transformed problem" to learn the best values for n and U_max . Repeat this procedure for all datasets except the Retau35k dataset. We will use this dataset to perform a leave-one-out validation. Basically we assume that we are "blind" to some of the data, and we seek for n and U_max without using that data. After finding appropriate values for n and U_max we can use them to predict the velocity for the case Re=35K and verify that the predicted velocities agree with the experimental observations.
Step 5 : Validation. Use the regression values to interpolate n and Umax at Reynolds = 35061, and predict the velocity profile for values of y in the file of reynold's number 35k. Compare the velocity profile you just found with the experimental data in column U, and compute the error.
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('classic')
Re = [5522, 9906, 21337, 31793, 39855]
def dat(location):
list_U_max = []
list_n = []
for l in range(len(location)):
list_u = []
list_y = []
for key in location[l]:
for q in range(len(key)):
list_u.append(key[q][1])
list_y.append(key[q][0])
for j in range(len(list_y)):
B = np.zeros((len(list_u),1))
A = np.zeros((len(list_y),2))
for k in range(len(list_y)):
B[k] = np.log(list_u[k])
A[k,0] = 1
A[k,1] = np.log(list_y[k]/R)
LHS = (np.transpose(A)).dot(A)
RHS = (np.transpose(A)).dot(B)
X = np.linalg.solve(LHS,RHS)
U_max = np.exp(X[0])
n_max = 1/X[1]
list_U_max.append(U_max.tolist())
list_n.append(n_max.tolist())
return list_U_max, list_n
def regression(list_, Re_):
list_a = np.array(list_)
U = np.zeros((len(list_a),))
Re_L = np.zeros((len(Re_),2))
for i in range(len(list_)):
U[i] = list_a[i]
Re_L[i,0] = 1
Re_L[i,1] = Re_[i]
A1 = (np.transpose(Re_L)).dot(Re_L)
B1 = (np.transpose(Re_L)).dot(U)
U_I = np.linalg.solve(A1, B1)
return U_I
R = 0.4500
data = [[np.loadtxt("/work/Files_for_Flow/Data/Retau_5k_basic_stats.txt", dtype= float, skiprows=6, comments = '#')],[np.loadtxt("/work/Files_for_Flow/Data/Retau_10k_basic_stats.txt", dtype= float, skiprows=6, comments = '#')],[np.loadtxt("/work/Files_for_Flow/Data/Retau_20k_basic_stats.txt", dtype= float, skiprows=6, comments = '#')], [np.loadtxt("/work/Files_for_Flow/Data/Retau_30k_basic_stats.txt", dtype= float, skiprows=6, comments = '#')], [np.loadtxt("/work/Files_for_Flow/Data/Retau_40k_basic_stats.txt", dtype= float, skiprows=6, comments = '#')]]
values = list(dat(data))
for i in range(5):
print(f"At Re = {Re[i]}, U_max = {values[0][i]}, n = {values[1][i]}")
x_axis = np.linspace(5000, 45000, 10000)
y_axis_N = regression(dat(data)[1], Re)[0] + regression(dat(data)[1], Re)[1]*x_axis
y_axis_U = regression(dat(data)[0], Re)[0] + regression(dat(data)[0], Re)[1]*x_axis
fig, ax = plt.subplots()
ax.plot(Re,dat(data)[1], "o", label = "Reynolds_number vs n")
ax.plot(x_axis, y_axis_N, label = "Regression of Re and n")
plt.title("Regression curve to interpolate the n w.r.t Re")
plt.ylabel("n")
plt.xlabel("Reynold's number")
plt.legend(loc='upper left')
plt.show()
fig, bx = plt.subplots()
bx.plot(Re,dat(data)[0], "o", label = "Reynolds_number vs Umax")
bx.plot(x_axis, y_axis_U, label = "Regression of Re and Umax")
plt.title("Regression curve to interpolate the Umax w.r.t Re")
plt.ylabel("max.velocity (m/s)")
plt.xlabel("Reynold's number")
plt.legend(loc='upper left')
plt.show()
U_max_35k_I = regression(dat(data)[0], Re)[0] + regression(dat(data)[0], Re)[1]*35061
n_35k_I = regression(dat(data)[1], Re)[0] + regression(dat(data)[1], Re)[1]*35061
print(f"Regression Data: At 35K, U_max = {U_max_35k_I}, n = {n_35k_I}")
validation = [[np.loadtxt("/work/Files_for_Flow/Data/Retau_35k_basic_stats.txt", dtype= float, skiprows=6, comments = '#')]]
print(f"Experimental Data: At 35K, U_max = {(dat(validation))[0]}, n = {(dat(validation))[1]}")
xy_I = np.loadtxt("/work/Files_for_Flow/Data/Retau_35k_basic_stats.txt", dtype= float, skiprows=6, comments = '#')
R = 0.45
list_y_I = []
list_u_I = []
list_u = []
for key in xy_I:
list_y_I.append(key[0])
for key in xy_I:
list_u.append(key[1])
for i in range(len(list_y_I)):
list_u_I.append(np.exp((np.log(U_max_35k_I)+(np.log(list_y_I[i]/R)/n_35k_I))))
x = 0
for i in range(len(list_u)):
x += (abs(list_u_I[i] - list_u[i]))
relative_error = (x/sum(list_u))*100
plt.plot(list_u_I, list_y_I, "ok", label = "Regression curve")
plt.plot(list_u, list_y_I, "*r", label = "Experimental curve")
plt.title("Regression curve vs Experimental curve for RE = 35K")
plt.ylabel("Distance from the wall to center of duct(m)")
plt.xlabel("Velocity(m/s)")
plt.legend(loc='upper left')
plt.show()
print(f'RELATIVE ERROR = {relative_error} %')