Python for Beginners ~ Part 1 ~

Python is free and one of the most popular programming languages. It is simple, flexible, and readable, and it comes with a rich set of built-in functions. Python can also use external libraries, so it is used in a wide range of fields. Machine learning and deep learning are among the most popular of these.

Over several posts, we will learn the basics of Python from scratch. Let’s take a look at the basic concepts and grammar, keeping in mind their use in data science. Python runs a script line by line, so when something fails, the error message tells you which line went wrong. So don’t be afraid to make mistakes. Rather, make plenty of them, and build your code up in small steps with small modifications.

In this post, we will cover the following topics.

  • Variables
  • Comment “#”
  • Arithmetic operations
  • Boolean
  • Comparison operator
  • List

Learning the basics of programming can feel boring. However, creative work awaits beyond that!

Note) The sample code in this post is meant to be run in Jupyter Notebook. Lines starting with “>>” show the output.

Variables

A variable is like a box for storing information. We give a variable a name, such as “x” or “y”, and put a value, such as a number or a string, into it. Let’s take a look.

We first put 1 into “x” and then check the content with the “print()” function, which displays the content on your screen.

x = 1
print(x)

>>  1

Similarly, strings can also be stored in variables. A string is written between double quotes (" ") or single quotes (' ').

y = "This is a test."
print(y)

>>  This is a test.

Every value has a data type, e.g. “int”, “float”, and “str”. “int” is for integers, “float” is for decimal numbers, and “str” is for strings. Let’s check the type with the “type()” function.

type(x)  # x = 1
>>  int

type(y)  # y = "This is a test."
>>  str

z = 1.5
type(z)

>>  float

Python has functions that convert a value into the “int”, “float”, or “str” type. These functions are “int()”, “float()”, and “str()”.

a = 10

type( int(a) )
>>  int

type( float(a) )
>>  float

type( str(a) )
>>  str
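
As a small sketch of why these conversions matter in practice, the example below turns a string of digits into numbers and back again. The variable names and values are made up for illustration only.

```python
# Hypothetical example values; the names are for illustration only.
text = "42"                    # a string that happens to contain digits

as_int = int(text)             # "42" -> 42
as_float = float(text)         # "42" -> 42.0
back_to_str = str(as_int + 1)  # 43 -> "43"

print(as_int, type(as_int))            # 42 <class 'int'>
print(as_float, type(as_float))        # 42.0 <class 'float'>
print(back_to_str, type(back_to_str))  # 43 <class 'str'>

# int() drops the decimal part of a float (it does not round).
truncated = int(1.9)
print(truncated)                       # 1
```

Note that “print(type(...))” shows the type as “<class 'int'>”, while a bare “type(...)” at the end of a Jupyter cell shows just “int”.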

Here are some basic rules for variable names. The following is enough to get started, but see the official Python documentation for more details.

Rules of variable name

  • It is case sensitive.
  • Only letters, numbers, and “_” can be used.
  • The first character must NOT be a number.

For example, “abc” and “ABC” are distinguished. “number_6” is OK, but “number-6” is prohibited. Including “-” is a frequent mistake; Python reads it as the subtraction operator. Besides, “5_abc” is also prohibited, because the first character must be a letter or “_”. However, names starting with “_” often have a special meaning by convention, and some are used by Python itself. Therefore, the author highly recommends that variable names start with a letter, especially for beginners.
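
If you are unsure whether a name is allowed, you can ask Python itself. The sketch below uses the built-in string method “isidentifier()” and the standard “keyword” module; the helper name “is_usable_name” is hypothetical and exists only for this illustration.

```python
import keyword

# Candidate names from the text above, plus one reserved word ("for").
candidates = ["abc", "ABC", "number_6", "number-6", "5_abc", "_hidden", "for"]


def is_usable_name(name):
    """Return True if `name` is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for name in candidates:
    print(name, is_usable_name(name))
```

In this sketch, “number-6”, “5_abc”, and “for” come out as False, and the other names come out as True.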

Comment “#”

Note that “#” starts a comment, which is NOT executed by Python. Python ignores everything after “#” on that line. In the following example, Python treats only print("Hello world!") as code. The other contents, “This is for a tutorial of Python.” and “print the first message!”, are regarded as comments.

# This is for a tutorial of Python.
print("Hello world!")  # print the first message!

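There is another way to write comments. Text placed between “"""” and “"""” is often used as a multi-line comment. Strictly speaking, it is a string that is not assigned to any variable, so Python evaluates it and simply discards it. In the following example, the sentence “This is a comment.” has no effect on the result.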

"""
   This is a comment.
"""
print("test")

>>  test

Comments are important because we forget what the code we wrote in the past was for. Therefore, programmers leave explanations and their thoughts as comments. Keeping them concise and clear is very important.

Arithmetic operations

Here, let’s see simple examples of arithmetic operations. Complex calculations are built from simple arithmetic operations. So, most programmers do NOT deal with complex mathematics directly but combine simple operations.

Operator   Description
+          addition
-          subtraction
*          multiplication
/          division
**         power

2 + 3  # addition
>>  5

9 - 3  # subtraction
>>  6

5 * 20  # multiplication
>>  100

100 / 20  # division (the result is always a float)
>>  5.0

2**3  # power
>>  8
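
A few details about these operators are easy to miss, so here is a short sketch. The operators “//” (floor division) and “%” (remainder) are not listed in the table above; they are added here as a supplement, and the variable names are only for illustration.

```python
# True division: the result is a float even when it divides evenly.
quotient = 100 / 20
print(quotient)                # 5.0

# Floor division and remainder (supplementary operators).
whole = 7 // 2
rest = 7 % 2
print(whole, rest)             # 3 1

# Operator precedence: "*" is evaluated before "+".
# Parentheses change the order of evaluation.
no_parens = 2 + 3 * 4
with_parens = (2 + 3) * 4
print(no_parens, with_parens)  # 14 20
```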

Boolean

A Boolean is just True or False. It is, however, important because we build an algorithm by controlling its flow with True or False. For example, assume the following situation: when an examination score is 70 or higher, the student passes; when the score is less than 70, the student does not pass. To branch the code on such a condition, a Boolean is needed.

The concept of a Boolean may be unfamiliar to beginners, but Python expresses it intuitively. In the example below, the result of evaluating x > 0 (or x < 0) is assigned to the variable “boolean”.

x = 1   # Assign 1 to the variable "x"

boolean = x > 0
print(boolean)

>>  True

boolean = x < 0
print(boolean)

>>  False

There is no problem with the above code. However, the author recommends the following style for readability. The parentheses make it clear what is assigned to the variable “boolean”.

boolean = (x > 0)
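
The examination example can be sketched as follows. This is a minimal illustration that assumes a score of exactly 70 counts as a pass; the names “pass_mark”, “score”, “passed”, and “result” are hypothetical. The “if” statement used here to act on the Boolean is not covered in this post.

```python
pass_mark = 70   # assumed threshold from the examination example
score = 82       # hypothetical score

# The comparison produces a Boolean, which we store in a variable.
passed = (score >= pass_mark)
print(passed)    # True

# The Boolean then decides which branch of the code runs.
if passed:
    result = "Passed"
else:
    result = "Not passed"
print(result)    # Passed
```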

Comparison operator

Here, let me introduce the comparison operators, which return a Boolean. We have already seen examples such as “<” and “>”. The ones we use most frequently are listed below.

Operator   Description
>          [A > B] A is greater than B.
<          [A < B] A is less than B.
>=         [A >= B] A is greater than or equal to B.
<=         [A <= B] A is less than or equal to B.
==         [A == B] A equals B.
!=         [A != B] A does NOT equal B.

Examples with numerical values.

1 > 2
>>  False

5 < 9
>>  True

5 >= 5
>>  True

10 <= 8
>>  False

10 == 10
>>  True

10 != 10
>>  False

Examples with strings. Recall that Python is case sensitive.

"apple" == "APPLE"
>>  False

"apple" != "APPLE"
>>  True
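
When the difference between upper and lower case should not matter, one common approach is to convert both strings to lower case before comparing. The sketch below uses the built-in string method “lower()”; the variable names are only for illustration.

```python
a = "apple"
b = "APPLE"

# A plain comparison is case sensitive.
strict_equal = (a == b)
print(strict_equal)   # False

# Converting both sides to lower case ignores the difference in case.
loose_equal = (a.lower() == b.lower())
print(loose_equal)    # True
```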

List

A list is a data structure that holds a sequence of data. For example, we can treat 1, 2, 3, ... as a group by using a list. A list is written with “[]”. Let’s see an example.

A = [1,  2,  3]
print(A)

>>  [1, 2, 3]

“A” is a list that stores the sequence 1, 2, and 3. Each element can be accessed by its index, e.g. A[0]. Note that, in Python, an index starts from 0.

A[0]

>>  1
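
Because indexing from 0 is a common source of off-by-one mistakes, here is a short sketch using a separate, hypothetical list “nums”. Negative indices and slices are not covered in this post; they are shown only as a supplement.

```python
nums = [10, 20, 30]    # hypothetical list for illustration

first = nums[0]        # the first element has index 0
last = nums[-1]        # a negative index counts from the end
first_two = nums[0:2]  # a slice: index 0 up to, but not including, 2

print(first, last, first_two)   # 10 30 [10, 20]

# The valid indices are 0, 1, and 2.
# nums[3] would raise an IndexError because the list has only 3 elements.
```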

It is easy to replace, add, or delete elements. This property is called “mutable”.

Let’s replace the 2 in A[1] with 5. You will see that A[1] is rewritten from 2 to 5.

A[1] = 5  # A = [1,  2,  3]
print(A)

>>  [1, 5, 3]

We can easily add a new element to the end of the list by the “.append()” method. Let’s add 99 to the list “A”.

A.append(99)  # A = [1,  5,  3]
print(A)

>>  [1, 5, 3, 99]

Of course, it is also easy to delete an element. We do it with the “del” keyword. Let’s delete the element 3 at A[2]. We will see that the list “A” changes from “[1, 5, 3, 99]” to “[1, 5, 99]”.

del A[2]  # A = [1,  5,  3,  99]
print(A)

>>  [1, 5, 99]

One more point: a list can hold numbers and strings together. Let’s add the string “apple” to the list “A”.

A.append("apple")
print(A)

>>  [1, 5, 99, 'apple']

Actually, this is where Python differs from other well-known programming languages such as C (C++) and Fortran. In those languages, you first declare a variable’s name and its type, so an array (the counterpart of a “list” in Python) can hold only a sequence of data of the same type.

This is one of the reasons that Python is said to be a flexible language. Of course, there are some disadvantages. One is that the processing speed is slower. Therefore, for purely numerical operations, you should consider a so-called NumPy array, which handles only arrays composed of elements of the same type. NumPy is introduced in another post.

We can easily get the length of a list by the “len()” function.

len(A)  # A = [1, 5, 99, 'apple']

>>  4

Finally, the author would like to introduce one frequent mistake. Let’s create a new list “B” that is the same as the list “A”.

B = A  # A = [1, 5, 99, 'apple']
print(B)

>>  [1, 5, 99, 'apple']

B has the same elements as A. Next, let’s replace one element of B.

B[3] = "orange"  # B[3] was "apple"
print(B)

>>  [1, 5, 99, 'orange']

We have confirmed that the element B[3] has changed from “apple” to “orange”. Then, let’s check A too.

A

>>  [1, 5, 99, 'orange']

Surprisingly, A has also changed! Actually, “B = A” does not create a new list; it makes B refer to the same list object in memory as A. We can avoid the above mistake with the code below.

A = [1, 5, 99, 'apple']
##-- Generate the list "B"
B = A.copy()
B[3] = "orange"  # B[3] was "apple"

print(A) # Check the original list "A"

>>  [1, 5, 99, 'apple']

We have confirmed that the list A was NOT changed. To get an independent copy of a list, use the “.copy()” method.
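
The difference between the two cases can be checked directly with the “is” operator, which tells us whether two names refer to the same object. The sketch below uses hypothetical names (“original”, “alias”, “clone”) so that it does not interfere with the lists above.

```python
original = [1, 5, 99, "apple"]

alias = original         # a second name for the SAME list
clone = original.copy()  # a NEW list with the same elements

print(alias is original)   # True:  same object
print(clone is original)   # False: different object
print(clone == original)   # True:  equal contents

# Changing the clone leaves the original untouched.
clone[3] = "orange"
print(original)   # [1, 5, 99, 'apple']

# Changing the alias changes the original, because they are the same list.
alias[0] = 100
print(original)   # [100, 5, 99, 'apple']
```

Note that “.copy()” copies only the list itself. If a list contains other lists, those inner lists are still shared; for that case, the standard library offers “copy.deepcopy()”.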

Summary

We have seen the basics of Python with sample code. The author would be glad if the reader feels that Python is flexible and intuitive. Python is a good choice for beginners.

In the next post, we will continue with more of the basics of Python.

GitHub Beginner’s Guide for Personal Use

Git, a version control system, is one of the essential tools for programmers and software engineers. In particular, GitHub, a hosting service based on Git, is becoming a standard skill for such engineers.

GitHub is a well-known service for managing the versions of a software development project. On GitHub, you can host your repository on the web and keep your code there. Besides, GitHub has many rich functions that make it easier to manage versions of code, so it is practical for large-scale project management. On the other hand, the hurdle for personal use is a little high for beginners because of its peculiar concepts, such as “commit” and “push”.

In this post, we will see the basic usage of GitHub, especially the process of creating a new repository and pushing your code. It is intended for beginners. After reading this post, you will be able to keep your code on GitHub, making your work more efficient.

What is GitHub?

Git, the core system behind GitHub, is an open-source tool for version control. In particular, its ability to track changes between versions is very useful and makes it easier to run a software development project as a team.

GitHub is a well-known service built on Git. Roughly speaking, GitHub is a platform for managing our own code and using code that someone else has written. We can manage not only personal code but also open-source projects. Therefore, many open-source projects around the world are published on GitHub. The Python library you are using may also be published there.

The basic concept of GitHub is to synchronize a repository, which is like a directory containing your code, between your PC and the GitHub server. The key feature is that we synchronize not only the code but also the record of changes. This is why GitHub is a powerful tool for developing as a team.

Try Git

First of all, if you have NOT installed Git, you have to install it. Please refer to the official Git site.

When Git is successfully installed, you will see the following message after executing the “git” command in the terminal or the command prompt.

git

>>  usage: git [--version] [--help] [-C <path>] [-c <name>=<value>]
>>             [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
>>             [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare]
>>             [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
>>             <command> [<args>]
>>  
>>  These are common Git commands used in various situations:
>>  ...
>>  ...

Git command

Here, we will use Git commands. The basic format is “git <command>”, where <command> is a subcommand such as “clone” or “add”. In this article, we will use just five commands, as follows.

git clone
git status
git add
git commit -m "<comment>"
git push

For personal use, these five commands are all you need. Each command will be explained below.

Create a New Repository

First, create a new repository for your project. A repository is like a folder for your code. It is easy to create a new repository in your GitHub account.

1. Visit the GitHub site and log in to your account. If you don’t have an account, please create one.

2. Go to the “Repositories” tab, and click the “New” button.

3. Fill in the necessary information, “Repository name”, “Public or Private”, and “Initialized files”.

Note that a “Private” repository is visible only to you and the people you invite. If it’s okay to publish it worldwide like a web page, select “Public”.

Whether you check “Add a README file” is up to you. The “README” file holds the description of your project. Of course, you can add the “README” file manually later.

Clone the Repository

The “clone” command copies the repository on GitHub to your PC as a local repository that stays linked to it. All you need to clone is the URL of your repository on GitHub.

1. Click the green “Code” button.

2. Copy the URL in the HTTPS tab. You can copy it by clicking the copy icon next to the URL. Note that the HTTPS tab is selected by default.

3. Execute the following command in the working directory on your terminal.

git clone <URL>

When the clone finishes successfully, a directory with the same name as the repository is created. The version history of the repository is stored in the “.git” directory inside it. “.git” is a hidden directory, so the “ls” command does not show it. You have to use the “ls -a” command, where “-a” is the option for showing hidden files and directories.

Confirm the “Status”

First of all, we have to specify the files to synchronize with the repository on GitHub. Create a new script “sample.py” in the directory you cloned. For example, we can create it with the “touch” command.

touch sample.py

Next, we will use the “git add” command to put the target file into the staged state. Before executing “git add”, let’s confirm the current state of the file with the “git status” command.

git status

>>  On branch master
>>  Your branch is up to date with 'origin/master'.
>>  
>>  Untracked files:
>>    (use "git add <file>..." to include in what will be committed)
>>  
>>  	sample.py
>>  
>>  nothing added to commit but untracked files present (use "git add" to track)

“Untracked files:” indicates that “sample.py” is a new file. Note that the file is NOT staged yet, so “sample.py” is displayed in red. Next, let’s change the status of “sample.py”. We will see that its color changes.

Change the “Status”

We change the status of the file by the “git add” command and check the status again by the “git status” command.

git add sample.py
git status

>>  On branch master
>>  Your branch is up to date with 'origin/master'.
>>  
>>  Changes to be committed:
>>    (use "git reset HEAD <file>..." to unstage)
>>  
>>  	new file:   sample.py
>>  

Git recognized “sample.py” as a new file!

We have also seen the color change: “sample.py” is now displayed in green instead of red. Green indicates that the file is staged!

Note that you can cancel the “git add” command for “sample.py”. After the following command, you will see that “sample.py” is unstaged.

git reset sample.py

Why is staging needed?

Beginners may be unfamiliar with the concept of “staging”. Why is staging needed? The answer is to prepare for committing. Git records the changes of a file in the version history when committing. To decide which files go into the next commit, we specify them explicitly with the “git add” command.

“Commit” the staged files

Next, we will record the changes of the staged file in the local repository. This operation is called “commit”. The command is as follows.

git commit -m "This is comment."

“-m” is the option for a comment (a commit message). The message makes it possible to understand the intention behind the change to the code.

The concept of “commit” might be unfamiliar to beginners. Why is the commit needed? At the commit stage, Git records the modified files NOT in the GitHub repository but in your local repository. Therefore, when developing as a team, you can commit as often as you like without affecting your teammates. At the “push” stage, your modifications and the version history are synchronized with the GitHub repository. Because each commit is recorded with its author, your teammates can distinguish your changes from those of other people!

“Push” the committed files

The final step is to synchronize your local repository with your GitHub repository. This operation is called “push”. After the “git push” command, the committed changes will be reflected in your GitHub repository.

The command is as follows.

git push

If the push succeeds, you can confirm on your GitHub web page that your new file “sample.py” exists in your GitHub repository.

Congratulations! This is the main flow of managing files on GitHub.

When you modify a file

Above, we saw how to add a new file. Here, we will look at the case of a modified file.

Please make some change to “sample.py”. Then, execute the “git status” command. You will see that Git recognizes the file as modified.

The file is NOT staged yet, so “sample.py” is displayed in red.

git status

>>  On branch master
>>  Your branch is up to date with 'origin/master'.
>>  
>>  Changes not staged for commit:
>>    (use "git add <file>..." to update what will be committed)
>>    (use "git checkout -- <file>..." to discard changes in working directory)
>>  
>>  	modified:   sample.py
>>  

That is the only difference. From here, you proceed just as you’ve seen.

  1. git add
  2. git commit -m
  3. git push
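
The three steps above can also be sketched as a small Python script. This is only an illustration, not part of Git itself: the function names “build_git_commands” and “run_git_commands” and the “dry_run” flag are hypothetical, and the script assumes it is started inside the cloned directory. By default it only prints the commands and does not run them.

```python
import shlex
import subprocess


def build_git_commands(path, message):
    """Return the add / commit / push steps as lists of arguments."""
    return [
        ["git", "add", path],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def run_git_commands(commands, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for command in commands:
        print("$", shlex.join(command))
        if not dry_run:
            # check=True stops the script at the first command that fails.
            subprocess.run(command, check=True)


commands = build_git_commands("sample.py", "Update sample.py")
run_git_commands(commands)  # dry run: nothing is executed
```

Passing “dry_run=False” would actually execute the commands, so it is safer to check the printed commands first.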

Summary

We have learned the basic GitHub skills. For a data scientist, GitHub is one of the essential skills, in addition to programming. GitHub not only makes it easier to manage versions of code but also gives you opportunities to interact with other programmers.

GitHub hosts a vast amount of source code and knowledge. Why not use GitHub? You can get a chance to learn from great programmers around the world.