Python’s move towards being a language for data analysis has seen it borrow many features from R, a language designed for working with data.
These features include the dataframe, the statsmodels package
for building linear models, and the Python port of ggplot2. The latest addition
to this list is the PyDataset package, a resource modeled on the data sets that
come pre-packaged with R.
7 Mistakes to Avoid in Machine Learning
It’s fairly straightforward to get started with Machine
Learning thanks to the availability of several excellent open source APIs. However,
mastering the subject requires a deeper understanding of how the
algorithms behave.
One part of that understanding is learning how to deal with the
assumptions and drawbacks of the various algorithms being used. In a post for
KDnuggets, ex-Google engineer Cheng-Tao Chu outlines seven mistakes to avoid
for the aspiring Machine Learning expert.
Among his seven points, Chu talks about picking an
evaluation metric that fits the domain in which the model is being
applied, watching for outliers and handling them carefully, and avoiding
models that tend to overfit when the features
outnumber the data points.
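
Two of these points can be illustrated with a minimal sketch in plain Python. The data below is invented for illustration and does not come from Chu’s post; the helper names accuracy and recall are likewise hypothetical. The first part shows how accuracy can look excellent on imbalanced labels while recall reveals that the model never finds the rare class. The second part shows how a single outlier pulls the mean far more than the median.

```python
from statistics import mean, median


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model identified."""
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical imbalanced problem: 1 positive case among 100 examples.
y_true = [1] + [0] * 99
# A "model" that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)  # high, despite the model being useless
rec = recall(y_true, y_pred)    # zero: the positive case is never found
print(f"accuracy={acc:.2f} recall={rec:.2f}")

# Hypothetical measurements with one extreme outlier.
readings = [10, 11, 9, 10, 12, 500]
print(f"mean={mean(readings)} median={median(readings)}")
```

If missing a positive case is costly in the domain, as in fraud or disease screening, recall (or a metric that combines it with precision) is likely a better guide than accuracy. Similarly, the median here stays close to the typical reading while the mean does not, which is one reason to inspect outliers before fitting a model that is sensitive to them.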