You do not need to import any libraries or modules for K-means clustering because you will implement it from scratch. A code template is provided, and you only need to write your code at the locations marked "your code is here". Download the dataset ‘k_means_clustering_data.csv’ and save it in the working directory that holds your source code for this homework.

The dataset has two columns (‘x’ and ‘y’) and 42 records, which are 42 points in a 2D plane. Your goal is to group them into K clusters using the K-means clustering algorithm.

The basic procedure of K-means clustering is simple. Initially, we determine the number of clusters K and randomly select K points from the dataset as the centroids (centers) of these clusters. The K-means algorithm then iterates over the following steps until convergence:

a. Update each centroid's coordinates based on the data points in its cluster.
b. Measure the distance from each point in the dataset to the K centroids.
c. Assign each point to the cluster whose centroid is nearest.

This is the code provided; please fill it in:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def k_means_clustering(data, centroids, k):
    centroid_current = centroids
    centroid_last = pd.DataFrame()
    clusters = pd.DataFrame()
    data = pd.read_csv('k_means_clustering_data.csv')
    # keep 'data' as a DataFrame of floats, because data['Cluster'] is assigned below
    data = data[['x', 'y']].astype(float)
    # iterate until convergence
    while not centroid_current.equals(centroid_last):

        cluster_count = 0  # it counts the number of clusters. Cluster IDs start from 0.
        # calculate the distance of each point to the K centroids
        for idx, position in centroid_current.iterrows():
            # your code is here. Save the Euclidean distances into 'clusters'

            # your code ends
            cluster_count += 1

        # update cluster, assign the points to clusters
        clusterIDs = []
        for row_idx in range(len(clusters)):
            # your code is here. Check the distances at every row in 'clusters'. Save the assigned cluster IDs to points. The IDs start from 0

            # your code ends
        # assign points to clusters. The information is saved in the list and assigned to the dataset.
        data['Cluster'] = clusterIDs

        # store previous cluster
        centroid_last = centroid_current

        # Update the centroid of each cluster. All information is in 'data'. You have to calculate the new centroids based on the points in the same cluster.
        # The centroid is the center of a list of points. For example, (x1, y1), (x2, y2), ..., (xn, yn). The centroid is (x, y), where x = the mean of (x1, x2, ..., xn) and y = the mean of (y1, y2, ..., yn).
        centroids = []
        points = []  # save k lists of points in the list. The points in the same list are in the same cluster.
        # your code is here. The K centroids will be saved in 'centroids', e.g. [[1, 2], [3, 4], [5, 6]]

        # your code ends
        centroid_current = pd.DataFrame(data=centroids, columns=['x', 'y'])

        print("No updates on clusters: ", centroid_current.equals(centroid_last))

    print("Convergence! Final centroids:", centroid_current)
    # plotting
    print('Plotting...')
    colors = ['b', 'g', 'r', 'c', 'm', 'y', 'k']

    # scatter plot all points. All points are colored circles
    for i in range(k):
        p = np.array(points[i])
        x, y = p[:, 0], p[:, 1]

        plt.scatter(x, y, color=colors[i])
        plt.scatter(centroid_current['x'], centroid_current['y'], marker='^', color=colors[i])

    # scatter plot all centroids. All centroids are colored triangles
    for j in range(k):
        plt.scatter(centroid_current.iloc[j][0], centroid_current.iloc[j][1], marker='^', color=colors[j])

    plt.show()
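
One possible way to complete the three "your code is here" blocks is sketched below. It is an illustration under stated assumptions, not the official solution: it takes the points as a DataFrame argument instead of reading the CSV, keeps `data` as a pandas DataFrame throughout (which the template's `data['Cluster'] = clusterIDs` line requires), returns its results instead of plotting, and lets an empty cluster keep its previous centroid. The local name `new_centroids` and the return value are choices made for this sketch.

```python
import numpy as np
import pandas as pd

def k_means_clustering(data, centroids, k):
    """Sketch: 'data' and 'centroids' are DataFrames with 'x' and 'y' columns;
    'centroids' is assumed to hold exactly k rows."""
    data = data[['x', 'y']].astype(float).reset_index(drop=True)
    centroid_current = centroids[['x', 'y']].astype(float).reset_index(drop=True)
    centroid_last = pd.DataFrame()

    while not centroid_current.equals(centroid_last):
        clusters = pd.DataFrame(index=data.index)
        cluster_count = 0
        # Block 1: one column of Euclidean distances per centroid
        for idx, position in centroid_current.iterrows():
            clusters[cluster_count] = np.sqrt(
                (data['x'] - position['x']) ** 2 + (data['y'] - position['y']) ** 2
            )
            cluster_count += 1

        # Block 2: each point gets the ID of its nearest centroid
        clusterIDs = []
        for row_idx in range(len(clusters)):
            clusterIDs.append(int(clusters.iloc[row_idx].values.argmin()))
        data['Cluster'] = clusterIDs

        centroid_last = centroid_current

        # Block 3: the new centroid is the mean of the points in each cluster
        new_centroids = []
        points = []
        for cid in range(k):
            members = data.loc[data['Cluster'] == cid, ['x', 'y']]
            points.append(members.values.tolist())
            if members.empty:
                # assumption: an empty cluster keeps its old centroid
                new_centroids.append(centroid_current.iloc[cid].tolist())
            else:
                new_centroids.append([members['x'].mean(), members['y'].mean()])
        centroid_current = pd.DataFrame(data=new_centroids, columns=['x', 'y'])

    return centroid_current, data, points

# Small usage example with six hand-picked points and two starting centroids
sample = pd.DataFrame({'x': [1, 1, 2, 7, 7, 5], 'y': [0, 1, 0, 9, 10, 10]})
final_centroids, labeled, points = k_means_clustering(sample, sample.iloc[[0, 3]], 2)
print(final_centroids)
print(labeled)
```

For the random initialization described in the assignment, the starting centroids could be drawn with something like `data.sample(n=k)`; results will then depend on which points are picked.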

And this is the data provided:

x y
1 0
1 1
1 2
2 0
2 1
2 2
2 7
2 9
3 0
3 2
3 4
3 6
3 8
4 4
4 7
4 9
5 5
5 6
5 7
5 8
5 9
5 10
6 2
6 3
6 8
7 0
7 1
7 2
7 4
7 7
7 9
7 10
8 0
8 1
8 2
8 3
8 8
9 0
9 2
9 3
4 2
5 3
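
To make the iteration itself concrete, independent of pandas, here is a minimal standard-library sketch of the loop: measure distances, assign each point to its nearest centroid, recompute each centroid as the mean of its points, and stop when the centroids no longer change. The function names (`assign_clusters`, `update_centroids`, `k_means`), the `max_iter` safety cap, and the choice of six points and two starting centroids from the table above are all assumptions made for illustration.

```python
import math

def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels

def update_centroids(points, labels, centroids):
    """Return the mean of each cluster; an empty cluster keeps its old centroid."""
    new_centroids = []
    for cid, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == cid]
        if not members:
            new_centroids.append(old)
            continue
        xs, ys = zip(*members)
        new_centroids.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return new_centroids

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop changing."""
    centroids = [tuple(map(float, c)) for c in centroids]
    labels = []
    for _ in range(max_iter):
        labels = assign_clusters(points, centroids)
        new_centroids = update_centroids(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Six of the 42 points, with two of them used as starting centroids
points = [(1, 0), (1, 1), (2, 0), (7, 9), (7, 10), (5, 10)]
centroids, labels = k_means(points, [(1, 0), (7, 9)])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(centroids)
```

With well-separated points like these the loop settles after one update; on the full dataset the final clusters depend on the randomly chosen starting centroids.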
