

40 Must-Know Python Concepts For Beginners in Data Science

About a decade ago, Python was probably not the top tool for doing Data Science. But the pace it has picked up ever since has been remarkable, and the rest is history.


Python’s standard library alone contains a vast array of modules for working with data. Now add the extensive community efforts in the form of third-party libraries to make Python much more robust and you have a wide range of functionalities at your beck and call.



In this article, however, we will explore 40 Python concepts that beginners need to know to propel their Data Science career.


Note that this is by no means an exhaustive list of the useful and commonly used Python concepts; the concepts listed here are as many as we have written about on this platform.


The Concepts List

To make the list a little easier to follow, we group the concepts into six categories, based on how closely related their use cases are. We also arrange them in the order in which they are to be learnt. Lastly, you can click the name of the concept to link to a short code example of how the concepts can be used.


The groups of concepts are: data types, operators, data structures, control flow, functions, and other generally useful concepts.


Data Types

Since all we do in data science is about data, a thorough knowledge of data types, and not just mere recognition, is needed to be good at Python. Below is a list of the basic data types in Python.


1. Integers and Floats: These data types are used for numeric values. Integers hold whole numbers while floats hold numbers with a decimal point. They are both used extensively for computing and the only way you can escape them is probably not to code at all.


2. Strings: Strings represent text values and, if I mentioned earlier that you can’t escape numeric data types, note that it’s even more serious with strings. Aside from being a data type we operate on, strings are widely used for adding messages and formatting our results for better understanding. Because of their extensive use, strings come with a flurry of methods that help with manipulation. There is also a concept that deals strictly with formatting and can be found in the first link below.


3. Boolean: Evaluating whether expressions or conditions are true or false is a common thing when coding, and having a good knowledge of the Boolean data type will significantly add to your Python arsenal.
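To make the three basic data types above concrete, here is a minimal sketch. The variable names and values (units_sold, unit_price, product) are made up for illustration and do not come from any real dataset.

```python
# Integers and floats
units_sold = 12                      # int: a whole number
unit_price = 2.5                     # float: a number with a decimal point
revenue = units_sold * unit_price    # mixing int and float gives a float

# Strings
product = "  notebook "
clean_product = product.strip().title()   # drop outer spaces, capitalise the word
message = clean_product + " revenue: " + str(revenue)

# Booleans
is_profitable = revenue > 20         # a comparison evaluates to True or False

print(type(units_sold), type(unit_price), type(clean_product), type(is_profitable))
print(message)
print(is_profitable)
```

Note that revenue had to be converted with str() before it could be joined to text; Python does not mix strings and numbers automatically.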


Operators

These are Python concepts used for performing operations on variables and values.


4. Arithmetic: These are used for doing common mathematical operations. From simple addition down to modulus and exponentiation, these operators make your mathematical use of Python easy.


5. Assignment: It is tempting to think that the = sign is the only operator that is used for assigning values to variables. But delve a little deeper and you will see that you can combine it with other symbols, as in += and *=, when you want to do some arithmetic and assign the result in one step.


6. Comparison: As the name rightly suggests, these operators are used for comparing values, and they are worth knowing if you are a beginner in Python.


7. Logical: As you proceed in your learning, you will definitely need to work with conditional statements in order to tell the program what you want and do not want. In this case, you will obviously need the power of logical operators like and and or. Find useful use cases in the links provided.


8. Identity: Though similar to comparison operators, these operators check whether two variables refer to the very same object in memory, not merely whether their values are equal. With is and is not, you can tell whether a variable is the same as another object-wise.


9. Membership: This helps us to check whether a value is present in a given object such as a list, string, or dictionary. It uses the keywords in and not in.
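The six operator groups above can be tried out in a few lines. This is only an illustrative sketch; the numbers and lists are arbitrary.

```python
a, b = 7, 2

# Arithmetic
total = a + b          # addition
quotient = a / b       # true division always gives a float
floor_div = a // b     # floor division drops the fractional part
remainder = a % b      # modulus
power = a ** b         # exponentiation

# Assignment combined with arithmetic
counter = 10
counter += 5           # same as counter = counter + 5
counter *= 2           # same as counter = counter * 2

# Comparison
is_larger = a > b
is_equal = a == b

# Logical
in_range = a > 0 and a < 10
either_negative = a < 0 or b < 0
not_equal = not is_equal

# Identity versus equality
first = [1, 2, 3]
second = [1, 2, 3]
alias = first
same_value = first == second     # equal contents
same_object = first is second    # two separate list objects
alias_is_first = alias is first  # both names point to one object

# Membership
has_two = 2 in first
no_nine = 9 not in first

print(total, quotient, floor_div, remainder, power, counter)
print(same_value, same_object, alias_is_first)
```

The identity lines are the ones beginners most often trip over: first and second hold equal values, yet they are different objects, so == and is give different answers.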


Data Structures

These are data containers and are used for ordering and grouping different data types together.


10. Lists: These are probably the most common Python data container. They can hold different data types together. The elements in a list are ordered, changeable, and can contain duplicates. Python has several built-in methods for manipulating lists. It is definitely a container no data professional can do without.


11. Tuples: Tuples are similar to lists, except that their elements are unchangeable. Plus, they are often used to store related pieces of data. They are also useful for doing multiple variable assignments.


12. Sets: These are data containers for mutable, unordered collections of unique items. The unordered nature of sets allows for flexibility when storing and retrieving data. And like a list, a set has many methods that make working with it easy and intuitive.


13. Dictionaries: Unlike other data containers, dictionaries use key-value pairs for storing data. Dictionaries also come in very handy when you need to create compound data structures. What's more, they help you better understand the JSON data format, which is one of the common data formats data scientists work with.
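A short sketch of the four containers above, including the link between dictionaries and JSON. The student record is invented for illustration; json is part of Python's standard library.

```python
import json

# List: ordered, changeable, duplicates allowed
scores = [88, 92, 75, 92]
scores.append(60)        # add an element at the end
scores[0] = 90           # change an element in place

# Tuple: ordered but unchangeable
point = (3, 4)
x, y = point             # multiple assignment by unpacking
try:
    point[0] = 10        # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Set: unordered collection of unique items
unique_scores = set(scores)   # the duplicate 92 is stored once
unique_scores.add(100)

# Dictionary: key-value pairs, here holding a list as one of its values
student = {"name": "Ada", "scores": scores}
student["passed"] = True
average = sum(student["scores"]) / len(student["scores"])

# A dictionary maps naturally onto JSON text and back
as_json = json.dumps(student)
round_trip = json.loads(as_json)

print(scores, (x, y), sorted(unique_scores))
print(as_json)
```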


Control Flow

These concepts build logic into code by determining which parts of the code run, in what order, and which do not.


14. Conditional If Statement: This is used to specify whether to run or skip a block of code depending on whether a set condition is true or false. It is usually used with operators and works with two optional clauses, elif and else. This is one of the most common control flow tools in Python. It simply checks whether a condition is True and, if so, runs the block beneath it. If the condition is False, however, it follows the alternative route set by elif or else.


15. Boolean Expression: This is useful for cases when you need more than one condition or need a mixture of operators and calculations in the if statement. In other words, it is employed for cases where we have complicated conditions.


16. For loop: This is used for iterating over data containers or iterables such as lists, tuples, dictionaries, etc., a predefined number of times. It is a widely used concept and a sort of Swiss Army knife you will need as a beginner.


17. While loop: Though similar to the for loop, the while loop does not iterate a definite number of times. Instead, a condition is set, and while the condition holds true, the block under the while loop is executed repeatedly, for a number of iterations that need not be known in advance.


18. Break, Continue, and Pass: There are times too when we need more control over a loop, like when it should end early, skip certain iterations, or simply do nothing in a block where Python still requires a statement. The break, continue, and pass keywords serve these purposes respectively and can be a very handy collection of concepts in your day-to-day programming.


19. Zip and Enumerate: The zip and enumerate functions are also very useful for control flow in Python. With zip, you can output an iterator that combines more than one iterable into a sequence of tuples. In other words, instead of doing, say, three for loops on three different iterables, you can loop once by zipping the iterables together. After you’re done, you can then unzip each to the respective container as necessary. Also, there are times when you need the index of every item in your iterable alongside its value; enumerate is what you need then. The link provided contains the code to achieve that.


20. List comprehension: This is simply a one-liner version of the for loop. With list comprehension, you can quickly create lists from iterables. Another impressive thing about list comprehension is that, even though it is a one-liner, you can still add conditional statements just as in a for loop.
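The control flow tools above often appear together. Here is a minimal sketch using invented temperature readings, where -999 is an assumed placeholder for a missing value (a made-up convention for this example only).

```python
MISSING = -999                                 # assumed marker for a missing reading
temperatures = [18, 25, 31, MISSING, 22]       # made-up readings in Celsius
cities = ["Oslo", "Rome", "Cairo", "Lima", "Paris"]

# for loop over two lists at once with zip, plus if / elif / else and continue
labels = {}
for city, temp in zip(cities, temperatures):
    if temp == MISSING:
        continue                  # skip this iteration, move to the next pair
    if temp >= 30:
        labels[city] = "hot"
    elif temp >= 20:
        labels[city] = "warm"
    else:
        labels[city] = "cool"

# enumerate gives the index alongside the value; break ends the loop early
first_hot_index = None
for index, temp in enumerate(temperatures):
    if temp >= 30:
        first_hot_index = index
        break

# while loop: repeats for as long as the condition holds
countdown = 3
steps = []
while countdown > 0:
    steps.append(countdown)
    countdown -= 1

# pass: a do-nothing statement where Python requires a block
for temp in temperatures:
    if temp == MISSING:
        pass                      # placeholder for handling to be written later

# list comprehension with a condition
valid = [t for t in temperatures if t != MISSING]
fahrenheit = [round(t * 9 / 5 + 32) for t in valid]

print(labels)
print(first_hot_index, steps, valid, fahrenheit)
```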


Functions

These are the concepts that allow us to carry out almost any operation imaginable in Python. A function is usually defined once under a name and can then be used repeatedly. Below are a few useful examples for beginners.


21. User-defined function: The DRY principle of coding (Don’t Repeat Yourself) is a critical part of writing effective code that will ease not just your work as a programmer but also the work of those you collaborate with. While there are quite a number of ways to achieve this, the Python function is one sure bet. With it you can create a single function, or many functions wrapped up as a module, and call it whenever it is needed. In addition, functions break down our work into smaller, organized and easy-to-manage chunks.


22. Lambda Function: When you are looking for a small, anonymous function that can be used as an expression instead of a statement, then what you actually need is a lambda function. It is also better for cases where you are not likely to reuse the function, like a quick fix. It can be used with other functions like map, filter, and functools.reduce to do even more important things. The provided link takes you through how to use lambda with practical examples.


23. Map: Just reading the items of an iterable may not be enough, and you may want to apply some function to each of them first. Python's map function is what you need. All you need to do is pass the function and the iterable into map as arguments and you have your result. The provided link also gives an example where the map function can be more optimal than a for loop.


24. Eval: This function evaluates a Python expression that is stored in a string and returns the result. This makes it handy for executing formulas kept as text, such as arithmetic on integers and floats. Because it runs whatever code the string contains, it should only be given strings you trust. It is useful for beginners who want some mathematical powers in Python.


25. Range: The range function returns a sequence of numbers that starts from zero by default. On its own it may not look particularly useful, but when used alongside other functions/concepts its usefulness becomes clear. For example, it is usually employed in a for loop to repeat a block a set number of times or to step through the indexes of a sequence.


26. Input: This function prompts users for input. It provides the simplest form of interactivity between your code and its users. It can bring out some sheer excitement in a beginner to see their code not just communicating with a user but also producing the intended result!
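The functions discussed above can be sketched as follows. The mean function and the prices list are hypothetical examples, and the input() call is left as a comment because it would pause and wait for someone to type.

```python
from functools import reduce

# User-defined function: written once, reusable everywhere
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

prices = [10, 20, 30]
average_price = mean(prices)

# Lambda with map, filter, and reduce
doubled = list(map(lambda price: price * 2, prices))
above_fifteen = list(filter(lambda price: price > 15, prices))
total = reduce(lambda running, price: running + price, prices)

# eval: evaluates an expression stored in a string (trusted strings only)
formula = "2 * (3 + 4)"
formula_result = eval(formula)

# range: starts at zero by default; start, stop, and step can be given
first_five = list(range(5))
every_third = list(range(2, 10, 3))

# input: always returns a string, so convert it before doing arithmetic
# age = int(input("How old are you? "))

print(average_price, doubled, above_fifteen, total)
print(formula_result, first_five, every_third)
```

Wrapping map and filter in list() is needed only to see the results at once; both return lazy iterators that produce values on demand.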


Other Useful Concepts

This set of concepts does not share similar uses, but the concepts cut across various parts of programming in Python and thus are generally useful.

27. *Args and **Kwargs: There are times when defining a function that you may not be sure of the number of arguments that are needed. Just add an asterisk before the parameter name in the function definition and voila! Or if you are dealing with keyword arguments, just use a double asterisk. You can go through the provided link to get more insight.


28. Classes: Python is an object-oriented programming language and knowing how to create your own object, however simple, is definitely a step towards being a self-reliant user of the language. The knowledge you are likely to gather defining your own attributes and methods and creating instances of your class will no doubt make your use of other built-in Python packages much easier.


29. Inheritance: This is a concept that saves you the stress of having to create classes that have similar attributes and methods. Having created just one such class, you can make a new class just inherit the methods and properties of the first created class.


30. Iterators and Generators: An iterator is an object that returns data values one at a time and picks up where it left off each time it is called. Iterators can be obtained from iterables like lists and tuples. However, you may need to create a custom iterator directly, and that’s where generators come in. While these concepts may not be immediately needed by a beginner, having an early introduction to them isn’t a bad idea.


31. Local and Global Variables: Because it is likely that you would use certain variables repeatedly, it is important to understand how the region or scope of the variable determines its availability in the course of your coding. The provided link will also take you through the essence of paying attention to the name you give variables when working between a local and a global scope.


32. Modules: Once you are familiar with creating user-defined functions, you can extend your knowledge to modules. Just put those functions in a .py file and you can easily import and use them in another program, unlike functions that are only available in the file or session where they were defined. It’s definitely a handy and easy tool to master.


33. Date and Time: Working with date and time requires another set of technical know-how, since this data has its own types apart from the ones discussed earlier. Besides, it is common to have datasets that contain dates and times, and having this in your toolkit as a Python beginner is a big plus.


34. Regular Expression: When you work with strings, you will very likely need to parse certain words, phrases, and other important texts. Regular expressions, or RegEx for short, are a method of using sequences of characters to form a search pattern that will help you achieve this. It is a very useful concept that will ease your task with string data types.


35. Errors and Exceptions: You will have to debug errors. Did I just say will? You must debug errors. That’s just the life of a programmer. And having a knowledge of some of the most common errors and how to handle them will go a long way to make your Python journey smoother. Reduce the rate at which errors delay your work with the helpful links provided.


36. String Formatting: To make sure strings come out as expected, particularly when embedding non-string data types within and between them, this concept is a must-know. It is a very important concept in Python and is used especially for making our output much more intuitive.


37. Indexing and Slicing: It is not every time we want to work with the whole dataset, and sometimes we may just want to check or manipulate a subset of the data. That’s where indexing and slicing come in. They are very common concepts in Python for accessing elements of sequences such as lists, tuples, strings, etc. They are definitely concepts you need in your learning journey.


38. Reading/Importing File: As data is usually one of the first things in any Data Science project, having a method for reading data into your Python workspace is as essential as the word essential can go. From common file formats like txt, csv, xlsx, xml, html, json, pdf, hdf5 to software-based files from applications like MATLAB, SAS, Stata, Python and its libraries provide simple but effective methods for reading data in a few lines of code. The link above gives examples of how different data file formats can be imported in Python.


39. Writing file: Aside from reading an existing file, you can also write to an existing file, probably for some modification or update, and Python's write() method does this seamlessly.


40. PIP: Well, you can see this as a bonus concept, but it is worth knowing if you want to extend your knowledge of Python to third-party packages.
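Several of the concepts in this last group can be seen side by side in one short sketch. Every name here (summarize, Dataset, LabeledDataset, running_total, totals.txt) is hypothetical and chosen only for illustration. The sketch uses only the standard library, so nothing needs to be installed with pip.

```python
import re
import tempfile
from datetime import date
from pathlib import Path

# *args collects extra positional arguments, **kwargs extra keyword arguments
def summarize(*args, **kwargs):
    return len(args), sorted(kwargs)

arg_count, option_names = summarize(1, 2, 3, unit="kg", rounded=True)

# A class with attributes and a method, and a child class that inherits from it
class Dataset:
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows

    def size(self):
        return len(self.rows)

class LabeledDataset(Dataset):
    def __init__(self, name, rows, label):
        super().__init__(name, rows)   # reuse the parent's set-up code
        self.label = label

data = LabeledDataset("sales", [3, 5, 8], label="weekly")

# Generator: yields one value at a time and resumes where it left off
def running_total(values):
    total = 0
    for value in values:
        total += value
        yield total

totals = list(running_total(data.rows))

# Errors and exceptions: handle a failure instead of crashing
try:
    ratio = 1 / 0
except ZeroDivisionError:
    ratio = None

# Regular expression: find every four-digit number in a made-up sentence
years = re.findall(r"\d{4}", "Sampled in 2019 and again in 2023")

# Date and time: dates have their own type and support arithmetic
start, end = date(2023, 1, 1), date(2023, 1, 31)
days_between = (end - start).days
end_text = end.strftime("%Y-%m-%d")

# String formatting, indexing, and slicing
report = f"{data.name}: {data.size()} rows"
first_two = data.rows[:2]
last_row = data.rows[-1]

# Writing a file, then reading it back (inside a temporary folder)
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "totals.txt"
    with open(path, "w") as handle:
        handle.write("3,8,16")
    with open(path) as handle:
        loaded = handle.read().split(",")

print(arg_count, option_names, totals, ratio, years)
print(days_between, end_text, report, first_two, last_row, loaded)
```

The with statement closes each file automatically, even if an error occurs while the file is open.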


Happy Coding!





