Comments on Salmon Run: "An implementation of the Silhouette Score metric on Spark" by Sujit Pal (8 comments)

Anonymous (2021-02-27):
Thanks, this was useful. Cleared up a few things for me.

Sujit Pal (2020-04-29):
Awesome news, thanks oskarryn! I will use that one instead.

oskarryn (2019-06-12):
Hi, just for info: the ClusteringEvaluator that comes with Spark 2.3 (released more or less when you published the post) has a scalable Silhouette implementation (https://issues.apache.org/jira/browse/SPARK-14516).

Anonymous (2019-04-20):
Here you have a PySpark implementation for the Simplified Silhouette Score. I started from the data arrangement proposed for Scala. Hope this helps.

Sujit Pal (2018-03-14):
Thanks, and no, I don't have a Python version, sorry. It should be fairly easy to write, though (maybe easier, since numpy has a cleaner API than breeze, IMO). I used Scala here mainly because you can inline code inside one of Spark's higher-order functions; with Python you would need to write a separate function to compute the block that begins on line 40 and call it within the map call.

Anonymous (2018-03-13):
Thanks so much for this! Do you have a Python version of this?

Sujit Pal (2018-03-05):
Thanks for the kind words, Anonymous.

Anonymous (2018-03-05):
Very shortly this website will be famous amid all blogging visitors, due to its nice articles or reviews.
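Since two commenters ask about a Python version and one mentions a PySpark implementation of the Simplified Silhouette Score, here is a minimal numpy sketch of that variant: instead of full pairwise distances, a(i) is the distance from point i to its own cluster centroid and b(i) is the distance to the nearest other centroid, with s(i) = (b - a) / max(a, b). This is an illustrative sketch of the metric only, not the commenter's PySpark code nor the post's Scala implementation; the function name is made up, and for real Spark workloads the ClusteringEvaluator mentioned above (Spark 2.3+) is the better choice.

```python
import numpy as np

def simplified_silhouette(X, labels):
    """Mean Simplified Silhouette score over all points.

    a(i) = distance from point i to its own cluster centroid,
    b(i) = distance from point i to the nearest other centroid.
    Assumes at least two clusters.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    # centroid of each cluster, shape (n_clusters, n_features)
    centroids = np.array([X[labels == c].mean(axis=0) for c in clusters])
    # distances from every point to every centroid, shape (n_points, n_clusters)
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    own = np.searchsorted(clusters, labels)   # column index of own cluster
    rows = np.arange(len(X))
    a = dists[rows, own]                      # distance to own centroid
    masked = dists.copy()
    masked[rows, own] = np.inf                # exclude own cluster from b
    b = masked.min(axis=1)                    # nearest other centroid
    s = (b - a) / np.maximum(a, b)
    return s.mean()
```

On two well-separated clusters the score should be close to 1, and it drops toward 0 (or below) as clusters overlap.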