Machine Learning Library (MLlib) Guide

MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. At a high level, it provides tools such as:

- ML Algorithms: common learning algorithms such as classification, regression, clustering, and collaborative filtering
- Featurization: feature extraction, transformation, dimensionality reduction, and selection
- Pipelines: tools for constructing, evaluating, and tuning ML Pipelines
- Persistence: saving and loading algorithms, models, and Pipelines
- Utilities: linear algebra, statistics, data handling, etc.

Please feel free to contact me if you have any questions about this repo :)

Several machine learning examples implemented with PySpark:
- ALS-based recommendation algorithm [done]
- Classifiers with Spark [done]
- Regressors with Spark [done]
- Spark's ML library [done]
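
The recommendation example in the list above is based on alternating least squares (ALS). As a rough illustration of the idea only, the sketch below fits a rank-1 factorization to a handful of made-up ratings in plain Python. It is not Spark's `ALS` implementation and not the code in this repo; the names `als_rank1` and `rmse`, the toy ratings, and the hyperparameters are all hypothetical.

```python
import random


def als_rank1(ratings, n_users, n_items, reg=0.01, iters=30, seed=0):
    """Fit rating ~= x[user] * y[item] on observed (user, item, rating) triples.

    Hypothetical helper for illustration: a rank-1 factorization with a plain
    L2 penalty, which is simpler than what Spark's ALS does.
    """
    rng = random.Random(seed)
    x = [rng.uniform(0.5, 1.5) for _ in range(n_users)]
    y = [rng.uniform(0.5, 1.5) for _ in range(n_items)]
    for _ in range(iters):
        # Hold item factors fixed: each user factor is a 1-D ridge
        # regression with a closed-form solution.
        for u in range(n_users):
            num = sum(r * y[i] for (uu, i, r) in ratings if uu == u)
            den = sum(y[i] ** 2 for (uu, i, r) in ratings if uu == u) + reg
            x[u] = num / den
        # Hold user factors fixed and solve for each item factor the same way.
        for i in range(n_items):
            num = sum(r * x[u] for (u, ii, r) in ratings if ii == i)
            den = sum(x[u] ** 2 for (u, ii, r) in ratings if ii == i) + reg
            y[i] = num / den
    return x, y


def rmse(ratings, x, y):
    """Root-mean-square error of the factorization on the given triples."""
    total = sum((r - x[u] * y[i]) ** 2 for (u, i, r) in ratings)
    return (total / len(ratings)) ** 0.5


# Toy data (made up): ratings generated from user factors [1.0, 1.5, 2.0]
# and item factors [1.0, 2.0, 2.5], with three of the nine pairs left out.
observed = [
    (0, 0, 1.0), (0, 1, 2.0),
    (1, 0, 1.5), (1, 2, 3.75),
    (2, 1, 4.0), (2, 2, 5.0),
]

user_f, item_f = als_rank1(observed, n_users=3, n_items=3)
print("training RMSE:", rmse(observed, user_f, item_f))
# Predict a pair that was never observed (user 0, item 2).
print("predicted rating for user 0, item 2:", user_f[0] * item_f[2])
```

Each half-step is an ordinary least-squares problem because one side of the factorization is held fixed, which is what makes the alternation cheap. A real recommender would use more than one latent factor and a distributed solver.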