《计量经济学编程——以Python语言为工具》(严子中、张毅)

Chapter 1: Get Started with Python

March, 2024

Companion slides for Chapter 1

Outline

  • Why Python for Economists?
  • Setup the Python
  • Basics of Math and Variables
  • Built-in Functions and Modules
  • Data Structures
  • Control Flow
  • Functions and Classes
  • Use Python and Stata Together

Why Python for Economists?

  • Python is one of the world's most popular programming languages and is widely regarded as beginner-friendly.
  • It has an easy-to-use syntax and a rich programming environment.
    • it is easy for beginners to learn,
    • its simplicity does not limit the functional possibilities.
  • For economists,
    • the field of econometrics is constantly evolving, but...
    • Python provides economists with a practical means of implementing new econometric methods and applying them to real-world data.
  • Python is a glue language that can easily integrate with other packages such as Stata.
  • This course explains how Python and Stata can be used interchangeably, depending on the task.
  • For students new to econometrics, it is beneficial to
    • integrate econometric knowledge with a natural programming language;
    • translate the mathematical symbols and formulas into Python, which builds a deep understanding of the econometric theory behind them.

Setup the Python: Anaconda distribution

  • Download the freely available Anaconda Individual Distribution (anaconda.com)
  • Please visit https://b23.tv/yY5eIjI (a video guide to configuring Anaconda).
  • Anaconda offers a Python programming environment independent of the system's built-in one. It bundles many Python packages commonly used for scientific computing, such as
    • NumPy, SciPy, and Matplotlib.

The pip installer is available if you want to install further third-party Python modules unavailable via the Anaconda package manager.

pip install package_name

allows you to download and install the package and its dependencies to the existing directory tree.


Setup the Python: Jupyter

  • A Jupyter Notebook provides you with an easy-to-use and interactive data science environment. It is an open-source web application. You can use Jupyter Notebooks for all sorts of data science tasks, including
    • data cleaning and transformation,
    • numerical simulation,
    • exploratory data analysis,
    • data visualization,
    • machine learning and much more.

Setup the Python: best IDE for beginners?

  • Beginners often ask which IDE or editor is best.
    • Different IDEs serve different programming purposes, and there is no clear winner.
    • When choosing an IDE, you first have to consider your needs.
    • For econometric programming and data analysis, in addition to the Jupyter Notebook, the Spyder IDE is one of the excellent tools.

Setup the Python: Jupyter in the Cloud

  • What if you want to share a fully interactive notebook that does not require installation?
  • Or do you want to create notebooks without installing anything on your local machine?
  • JupyterLab is the latest web-based interactive development environment for notebooks, code, and data.
  • There are a few convenient ways to run your Jupyter Notebook in the cloud, which give you access to a well-configured environment.

Resources

Popular online resources to help readers start with Python econometric programming.


Setup the Python: "Hello world!"

  • From now on, we will start to program in Python.
  • A long-standing tradition in the programming world is to print a "Hello world!" message to the screen as the first look at a new language.
print("Hello world!")
Hello world!

Basics of Math and Variables: Python calculator

Python has a self-contained calculator that can perform basic arithmetic operations such as addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (**).

print(12-1)
print(1/2)
print(3*3)
print(3**3) # 3 raised to the third power
print(16**0.5) # square root of 16
11
0.5
9
27
4.0
print(2*(2-2))
print(2*2-2)
0
2

Keep in mind that true division is carried out with the forward-slash (/) operator and always returns a float. The double forward slash (//) performs floor division instead: it rounds the quotient down to the nearest whole number, so the fractional part is lost.

print(16/5)      # True division: the full quotient
print(16//5)     # Floor division: only the integer part of the quotient
print(17.5/5)    # True division
print(17.225//5) # Floor division (the result is still a float)
3.2
3
3.5
3.0
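
The // operator rounds the quotient down toward negative infinity, which matters as soon as an operand is negative. A minimal sketch (the variable names are illustrative and not taken from the slides):

```python
# Floor division rounds toward negative infinity, not toward zero.
q_pos = 16 // 5      # 3
q_neg = -16 // 5     # -4, because -3.2 is rounded down
r_neg = -16 % 5      # 4, so that 5 * q_neg + r_neg == -16

# divmod() returns the floor quotient and the remainder together.
quotient, remainder = divmod(16, 5)

print(q_pos, q_neg, r_neg, quotient, remainder)
```

If the integer part toward zero is wanted instead, int(-16 / 5) gives -3.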

Basics of Math and Variables: define a variable

x = 10
print(x*3+10) # multiply x by 3, add 10, and print the result
40

If a new value is assigned, the variable x is overwritten:

y = 5
x = y
x
5

Variable names are case-sensitive, meaning that uppercase and lowercase letters are considered different entities.

y = 5; Y = 15
y, Y
(5, 15)

Basics of Math and Variables: variable identifier

  • In Python, every object has a distinctive identifier, so variables holding different values have different identifiers.
  • To view this identifier, use the id() function.
    • id() is a built-in function.
a = 3; b = 1
print(id(a), id(b))
139635550388528 139635550388464

Basics of Math and Variables: variable identifier

If a and b refer to the same memory location in the computer, then their IDs will be identical. Here are some examples:

a = 1; b = 1
print(id(a), id(b))
139635550388464 139635550388464
a = 1; b = a
print(id(a), id(b))
140059077738736 140059077738736

Basics of Math and Variables: variable identifier

  • When a new value is assigned to one of two variables with the same ID,
    • both the value and the ID of that variable change, while the other variable is unaffected.
a = 10
print(a, b, id(a), id(b))
10 1 140059077739024 140059077738736

Basics of Math and Variables: "+=" and "*=" notations

As the previous example shows, variables can be overwritten when assigned a new value.

x = 4
x = x+1
x
5
  • Increasing a variable x by a fixed amount c is a common operation.
  • In Python, this can be achieved with the += notation: x += c is equivalent to x = x + c.
x += 1
x
6
  • Also, *=, -=, /=, **= notations act similarly.
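x = 4     # set the initial value to 4
x *= 1    # multiply x by 1 and reassign the result to x
print(x)  # print the updated x
x /= 2    # divide the updated x by 2
print(x)  # print the updated x
x **= 2   # raise the updated x to the power of 2
print(x)  # print the updated x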
4
2.0
4.0

Basics of Math and Variables: numbers

In Python, there are two categories of numbers: integers and floats.

x = 10; y = 10.0; z = 10.1  # define x, y, z and assign values
type(x), type(y), type(z)   # return the data types of x, y, z
(int, float, float)

Basics of Math and Variables: sys package

The sys module reports the largest representable finite float and the smallest positive normalized float.

import sys
print(sys.float_info.max, sys.float_info.min)
1.7976931348623157e+308 2.2250738585072014e-308

The float type also has special values such as inf, which represents infinity.

x = 1.8e+308; y = -1.8e+308
x, type(x), y, type(y) 
(inf, float, -inf, float)

Basics of Math and Variables: numbers

  • As Python programmers in the scientific field, it is important to be cautious about the accuracy of floats and the potential for computational errors.
  • Consider an example where two floats with slight differences in values are identified as the same value in Python, despite having unique IDs:
x = 1e-20+1
y = 1e-30+1
print(id(x), id(y))
x == y # test whether x and y are equal
140182128163120 140182128161648
True
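
Because of this limited precision, comparing floats with == can also fail where equality is expected. A minimal sketch of a tolerance-based comparison using math.isclose() from the standard library (the variable names are illustrative):

```python
import math

a = 0.1 + 0.2
b = 0.3

exact_equal = (a == b)             # False: the two values differ in the last bits
close_equal = math.isclose(a, b)   # True: equal within the default relative tolerance

print(exact_equal, close_equal)
```

In econometric code, a tolerance-based check of this kind is usually a safer way to test convergence than exact equality.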

Basics of Math and Variables: numbers

When working with floating numbers, it may be necessary to manipulate them using functions like abs(), round(), and int(). One commonly used function is abs(), which returns the absolute value of a number:

abs(-10.1) # absolute value of the number
10.1

To round a floating point number to an integer in Python, one might try the round() function.


Basics of Math and Variables: numbers

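# Odd integer part: round() behaves like ordinary rounding (9.5 becomes 10)
print(round(9.4))
print(round(9.5))
# Even integer part with fractional part .5: round() rounds half to even, so 10.5 becomes 10
print(round(10.4))
print(round(10.5))
9
10
10
10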

The int() function also converts a float to an integer. However, it simply discards the digits after the decimal point (truncation toward zero) rather than rounding.

print(int(10.1))
print(int(9.9))
10
9
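
For negative numbers, discarding the decimals is not the same as rounding down. A short sketch comparing int() with math.floor() and math.ceil() from the standard library (the variable names are illustrative):

```python
import math

x = -9.9
truncated = int(x)        # -9: int() drops the fractional part (toward zero)
floored = math.floor(x)   # -10: largest integer not greater than x
ceiled = math.ceil(x)     # -9: smallest integer not less than x

print(truncated, floored, ceiled)
```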

Basics of Math and Variables: strings

Variables of type string can store a variety of characters, including letters, numbers, spaces, commas, and more. To define a string variable:

x = "Hello my colleagues"
print(x)
Hello my colleagues

Basics of Math and Variables: strings

  • Note that one cannot use numeric functions such as addition or subtraction on string variables.
  • The + operator in fact concatenates strings.
x = "1" ; y = "2"
x+y
'12'

Basics of Math and Variables: use quotation in string

  • When the string itself contains single quotation marks,
    • enclose it in double quotes (and use single quotes around a string that contains double quotation marks).
  • The following example illustrates how it works.
x = 'You are my "sunshine"'
print(x)
y = "You are my 'sunshine'"
print(y)
z = "This is Mike's basketball"
print(z)
You are my "sunshine"
You are my 'sunshine'
This is Mike's basketball

Basics of Math and Variables: concatenate and split string variables

  • A list of available string methods can be found in the Python reference documentation.
  • The "addition" (+) can be applied to concatenate string variables.
x = "Hello"
y = "my colleagues"
z = x + y
print(z)
Hellomy colleagues

Basics of Math and Variables: concatenate and split string variables

It seems that we missed a white space character in this sentence. So,

z = x + " " + y 
print(z)          
Hello my colleagues

split() converts a string into a list of strings:

z.split(), z

(['Hello', 'my', 'colleagues'], 'Hello my colleagues')

Basics of Math and Variables: concatenate and split string variables

In summary,

  • Without arguments, split() separates the string wherever it finds white space, such as one or several spaces or a tab character.
  • If a separator is passed to split(), the string is split at every occurrence of that separator.
  • For example, we split several sentences at the Chinese full stop character.
x = "今天阳光明媚。今天温度较高。今天傍晚去篮球场。"
y = x.split("。")
y
['今天阳光明媚', '今天温度较高', '今天傍晚去篮球场', '']

The opposite string method to split() is join():

"。".join(y)
'今天阳光明媚。今天温度较高。今天傍晚去篮球场。'

Basics of Math and Variables: upper case and lower case

For strings containing English letters, the upper() and lower() methods convert all letters to upper case and lower case, respectively.

name = "python prograMMing fOr ecoNOmeTRics: a BEGINNER's guiDE"
print(name.upper())     
print(str.upper(name)) 
print(name.lower())    
print(str.lower(name)) 
PYTHON PROGRAMMING FOR ECONOMETRICS: A BEGINNER'S GUIDE
PYTHON PROGRAMMING FOR ECONOMETRICS: A BEGINNER'S GUIDE
python programming for econometrics: a beginner's guide
python programming for econometrics: a beginner's guide

Basics of Math and Variables: capitalizations

Further,

  • the capitalize() method converts the first character of the string to uppercase and the remaining characters to lowercase,
  • the title() method converts the first character of each word to uppercase and the remaining characters to lowercase, and returns a new string.

Basics of Math and Variables: capitalizations

name = "python prograMMing fOr ecoNOmeTRics: a BEGINNER's guiDE"
print(name.capitalize()) # first character in uppercase, all others in lowercase
Python programming for econometrics: a beginner's guide
print(name.title()) # title case: the first letter of every word in uppercase
Python Programming For Econometrics: A Beginner'S Guide

Basics of Math and Variables: convert between string and number

  • Once we define a string variable containing only an integer,
    • int() function can convert it to an integer.
a = "10"
int(a), type(a), type(int(a))
(10, str, int)
  • For the string containing only a floating point number,
    • eval() function converts a string to a number.
b = "10.09"
eval(b), type(b), type(eval(b))
(10.09, str, float)
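
Note that eval() executes any Python expression contained in the string, so for text that comes from a file or a user, float() is arguably a safer way to do the same conversion. A minimal sketch (the variable names value and parsed are illustrative):

```python
b = "10.09"
value = float(b)          # parses the text as a floating-point number
print(value, type(value))

# float() rejects text that is not a plain number instead of executing it.
try:
    float("10.09 + 1")
    parsed = True
except ValueError:
    parsed = False
print(parsed)
```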

Basics of Math and Variables: convert between string and number

Conversely, numbers can be converted into strings with str().

c = 10
d = 10.123
str(c), type(str(c)), str(d), type(str(d))
('10', str, '10.123', str)

Built-in Functions and Modules

  • The Python interpreter has many functions built into it.
  • Examples of built-in functions: print(), type(), and abs().
  • For a complete list of built-in functions and their usage:

Built-in Functions and Modules

  • Moreover, a module in Python groups a collection of related functions.
  • Python’s standard library contains a vast collection of modules.
  • Although Python is a general-purpose programming language,
    • we use math and statistics intensively for econometric programming.

Built-in Functions and Modules: math module

  • The Python math module provides mathematical functions such as
    • log, sin, cos, and many others.
  • To use functions such as the math.log() from the math module,
    • we need to import the module first and then call the function.
 import math
 math.log(1) 
0.0

Built-in Functions and Modules: math module

The constant pi of the math module gives the value of π.

math.pi
3.141592653589793

Built-in Functions and Modules: math module

Function calls can be nested to form composite functions in Python:

math.log(math.e), math.cos(math.pi)
(1.0, -1.0)

Built-in Functions and Modules: math module

Suppose one wants to import specific function(s) from a module:

 from math import log, sin 

This means one could use sin(x) and log(x) directly without adding the math. prefix.


Built-in Functions and Modules: math module

Finally, there is another notation we should pay attention to:

 from math import *

This command does not import one specific function from the math module; it imports all of them.


Built-in Functions and Modules: input and output

  • It is important to verify the contents and intermediate results of the program.
  • The print() function is widely used to display information on the screen.
  • Now, we can learn how to use print in a more advanced and professional manner.

For example, one could start a new line using the \n (backslash n) notation:

print("Line one\nLine two")
Line one
Line two

Insert a tab space by using the \t (backslash t) notation:

print("Word one\tWord two")
Word one Word two

A more comprehensive example:

d = "Fourth line"
print("First line\n"+"Second line\n"+"Third line\n" +d)
First line
Second line
Third line
Fourth line

Built-in Functions and Modules: formatted print

  • When working with econometric methods, we might want to
    • display numerical values and accompanying messages to present estimates in a clear and informative way...
  • In Python, this can be achieved by using format specifiers and variables.
  • Here is an example.

Built-in Functions and Modules: formatted print

print("beta = %d, alpha = %f" % (10,2.56))
beta = 10, alpha = 2.560000

Built-in Functions and Modules: formatted print

An overview of the commonly used format identifier with several examples:

------------------------------------------------------------------- 
Format identifier      Style                    Example
------------------------------------------------------------------- 
%f                     floats                  149597870700.000000
%d                     integers                149597870700
%s                     string                  "149597870700"
%e                     exponential notation    1.495979e+11
%g                     shorter of %e, %f       1.49598e+11
------------------------------------------------------------------- 

Built-in Functions and Modules: formatted print

It is worth noting that

  • the format specifier of type %W.Df means that a float should be printed with...
  • a total width of W characters and D digits behind the decimal point:
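# print as a float
print("Est. value = %f" % (12345.12345))
# print as an integer
print("Est. value = %d" % (12345.12345))
# print as a float with a total width of 8 and 1 digit after the decimal point
print("Est. value = %8.1f" % (12345.12345))
# print as a float with a total width of 10 and 1 digit after the decimal point
print("Est. value = %10.1f" % (12345.12345))
# print as a float with a total width of 15 and 3 digits after the decimal point
print("Est. value = %15.3f" % (12345.12345))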
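
The same width and precision codes also work inside f-strings (available from Python 3.6), which some readers may find easier to read. This is an alternative notation rather than the one used in these slides; the variable names are illustrative:

```python
value = 12345.12345

s_float = f"Est. value = {value:f}"      # same result as %f
s_w8 = f"Est. value = {value:8.1f}"      # same result as %8.1f
s_w15 = f"Est. value = {value:15.3f}"    # same result as %15.3f
s_old = "Est. value = %15.3f" % value    # the %-style version for comparison

print(s_float)
print(s_w8)
print(s_w15)
```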

Built-in Functions and Modules: formatted print

from math import pi
beta = 2.5613214
print(" beta = %5.3f \n pi = %5.2f \n and pi^4 = %e "
      % (beta,pi,pi**4))
 beta = 2.561 
 pi =  3.14 
 and pi^4 = 9.740909e+01 

Built-in Functions and Modules: read text files

  • The built-in function open() allows us to access a file and then read text from it or write text to it in a directory on our device.
  • We need to find a place on our machine to save documents, so we use the os.getcwd() function from the os module to get the current working directory:
import os # import the os module
cwd = os.getcwd() # store the current working directory in cwd
cwd
'/home/user'

Built-in Functions and Modules: write text files

Let us write some texts into a file named "example_text.txt" and save it in the current working directory:

write_file = open(cwd+"/example_text.txt", "w")
write_file.write("The 1st line\n"+"The 2nd line\n"+"The 3rd line")
write_file.close()

Built-in Functions and Modules: read text files

  • In the same manner, it is possible to
    • import a text file that already exists on the device and,
    • print out the results.
read_file = open(cwd+"/example_text.txt", "r")
text = read_file.read()
read_file.close()
print(text)
The 1st line
The 2nd line
The 3rd line
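
A common alternative to calling close() by hand is the with statement, which closes the file automatically even if an error occurs. The sketch below is a variation on the example above, not the slides' own code; it writes to a temporary folder (an assumption made here so that nothing is left behind on disk):

```python
import os
import tempfile

# A temporary folder stands in for the working directory used in the slides.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example_text.txt")

    # The file is closed automatically when each with-block ends.
    with open(path, "w") as write_file:
        write_file.write("The 1st line\nThe 2nd line\nThe 3rd line")

    with open(path, "r") as read_file:
        lines = read_file.read().splitlines()

print(lines)
```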

Data Structures

Full list of the object types:

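Python Tips:
The main Python data types are:

int (integer), str (string), float (floating-point number), complex (complex number);
bool (Boolean): takes the value True or False;
list: an ordered sequence of elements; duplicates are allowed and the contents can be changed;
tuple: similar to a list, but its contents cannot be changed;
dict (dictionary): a collection of key-value pairs; it keeps insertion order, keys must be unique, and the contents can be changed;
set: an unordered collection of unique elements; the contents can be changed.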
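
Dictionaries and sets are not demonstrated elsewhere in this part of the slides, so a brief sketch may help. The names estimates and labels and all values are made up for illustration:

```python
# dict: key-value pairs; keys are unique, values can be changed.
estimates = {"alpha": 2.56, "beta": 10}
estimates["gamma"] = 0.5          # add a new key-value pair
estimates["beta"] = 11            # overwrite an existing value
keys = list(estimates)            # keys keep their insertion order

# set: an unordered collection without duplicates.
labels = {"CN", "US", "CN"}       # the repeated "CN" is stored only once
labels.add("JP")                  # a set can still be modified
n_labels = len(labels)

print(keys, estimates["beta"], n_labels)
```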

Data Structures: tuples

  • Both the tuple and the list are Python object sequences.
    • They are very similar, but a tuple cannot be changed after it is created, whereas a list can.
  • The tuple object is typically defined using parentheses,
    • in which a comma separates each element.
a = (10,20,30.56,"Hello")
print(a, type(a))
(10, 20, 30.56, 'Hello') <class 'tuple'>

When defining a tuple, the parentheses are not necessary. For instance,

a = 10,20,30.56,"Hello"
print(a, type(a))
(10, 20, 30.56, 'Hello') <class 'tuple'>

Data Structures: lists

A list in Python is created using square brackets, for example:

a = [10,20,30.56,"Hello"]
print(a, type(a))
[10, 20, 30.56, 'Hello'] <class 'list'>
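
The practical difference between the two sequence types shows up when an element is reassigned. A minimal sketch (the names a_list, a_tuple, and changed are illustrative):

```python
a_list = [10, 20, 30.56, "Hello"]
a_tuple = (10, 20, 30.56, "Hello")

a_list[0] = 99            # a list can be changed in place

try:
    a_tuple[0] = 99       # a tuple cannot: this raises TypeError
    changed = True
except TypeError:
    changed = False

print(a_list, a_tuple, changed)
```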

Data Structures: operations

min, max and sum of a List

  • If we have a list that only includes numbers,
    • we can use the min(), max(), and sum() functions to find the lowest, highest, and sum of values of the list, respectively.
x = [0,1,2,3,4,5,10.85] 
[min(x), max(x), sum(x)]
[0, 10.85, 25.85]

Data Structures: operations

  • When the elements of a list are strings, min() and max() compare them in alphabetical order:
letters = ["a","b","c","d"] 
[min(letters), max(letters)] # smallest and largest letters in alphabetical order
words = ["Python","Stata","R","Java"]
[min(words), max(words)]
# strings whose first letter comes first and last in alphabetical order

Data Structures: operations

Length of Sequences

  • To obtain the length of a string,
    • we can use the len() function to count the number of characters.
x = "Hello"
y = "my colleagues"   # define the variable y
len(x), len(y)        # lengths of x and y; note that a space counts as one character
(5, 13)

Data Structures: operations

  • If using the len() function for a list,
    • the number of elements in a list can be obtained.
print([x,y])
len([x,y])
['Hello', 'my colleagues']
2

Data Structures: operations

Combine Lists and Remove Elements from a List

  • Similar to the string variables,
    • the + operator can concatenate lists and tuples in order.
a = [10,20,30]
b = [10,20,30,"Hello"]
c = [1,2,3]
print(a+b)
[10, 20, 30, 10, 20, 30, 'Hello']

Data Structures: operations

The append() method of a list adds another list to the end (not the beginning) of it as a single element (not as separate elements):

a.append(b)
a
[10, 20, 30, [10, 20, 30, 'Hello']]
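
If the goal is to add the elements one by one rather than as a nested list, the list method extend() can be used instead. A short sketch contrasting the two (the names nested and flat are illustrative):

```python
a = [10, 20, 30]
b = [10, 20, 30, "Hello"]

nested = a.copy()
nested.append(b)     # b becomes one single (nested) element

flat = a.copy()
flat.extend(b)       # each element of b is added separately

print(len(nested), len(flat))
```

Here flat is equal to a + b, while nested has only four elements, the last of which is the whole list b.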

Data Structures: operations

  • To eliminate a particular item from a list,
    • one can use the remove() method, which deletes the first occurrence of that item.
a = [10,20,30,"Hello"]
a.remove(20)
a
[10, 30, 'Hello']

Data Structures: indexing

  • To understand the index rule in Python, consider the operation a[i], which returns the element at index i of the variable a.
  • It is important to note that Python (like C but unlike Matlab) starts counting indices from zero, not one!
a = [10,20,30,"Hello"]
a[0], a[1], a[2], a[3]
(10, 20, 30, 'Hello')

Python provides a useful method for backward indexing of lists or tuples. For example,

  • "-1" is one element from the back of the list.
  • "-2" will return the 2nd last element:
a = [10,20,30,"Hello"]
# return the last, second-to-last, and third-to-last elements
a[-1], a[-2], a[-3]
('Hello', 30, 20)
  • With the index rule in Python, one could insert elements into a specific list position.
  • Specifically, the a.insert(i,content) method adds content at index i of the list a.
# insert the given element at index 0 of the list
a = [10,20,30,"Hello"]
a.insert(0,'IESR')
print("Add 'IESR' to position 0:", a)
# insert the given element at index 1 of the list
a = [10,20,30,"Hello"]
a.insert(1,'IESR')
print("Add 'IESR' to position 1:", a)
# insert the given element at index -1, i.e., just before the last element
a = [10,20,30,"Hello"]
a.insert(-1,'IESR')
print("Add 'IESR' to position -1:", a)
Add 'IESR' to position 0: ['IESR', 10, 20, 30, 'Hello']
Add 'IESR' to position 1: [10, 'IESR', 20, 30, 'Hello']
Add 'IESR' to position -1: [10, 20, 30, 'IESR', 'Hello']

Data Structures: slicing

  • While a[i] locates the single element at index i of the list, the a[i:j] notation gives the elements from index i up to index j-1.
  • Python follows the "half-open interval" rule, i.e.,
    • it keeps the elements at positions in the interval [i, j).
a = [10,20,30,"Hello"]
a[0:3] # return the three elements at indices 0, 1, 2
[10, 20, 30]

Data Structures: slicing

Similarly, we can also slice any elements from a list.

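print(a[0:2]) # elements of a at indices 0, 1
print(a[1:4]) # elements of a at indices 1, 2, 3
[10, 20]
[20, 30, 'Hello']

When the end index is omitted, the slice runs through the last element, so a[0:4] is equivalent to:

a[0:] # elements of a from index 0 through the last one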
[10, 20, 30, 'Hello']

Data Structures: slicing

  • Using the backward index, we can set the negative index to refer to the end of the sequence:
a[:-1] # elements of a from index 0 through the second-to-last one
[10, 20, 30]

Data Structures: slicing

a = [10,20,30,"Hello"]
# print the first three elements of a
print(a[:3])
# print all elements of a except the first one (in everyday counting)
print(a[1:])
[10, 20, 30]
[20, 30, 'Hello']

Data Structures: slicing

For string variables, the slice indices correspond to the positions of the characters in the string. For example, string slicing can be demonstrated as follows:

a = "Hello IESR Colleagues"
# for a string, the indices correspond to the order of its characters
print(a[0:3]) # the three characters at indices 0, 1, 2
print(a[:3])  # the three characters at indices 0, 1, 2
print(a[2:3]) # the character at index 2
print(a[:])   # all characters of a
Hel
Hel
l
Hello IESR Colleagues

Control Flow

  • The control flow is the order in which the program's code executes.
  • It starts from the top to bottom to process the script.
  • When more than one command is written in the same line and separated by ;,
    • then the interpreter processes from left to right.

Control Flow: if-elif-else statements

  • The if-elif-else is a decision control statement to
    • test conditions and execute instructions based on the actual condition.
  • We use a simple if statement to
    • test whether the variable x is strictly greater than zero.
x = 10
if x > 0: # if x satisfies x > 0
    print("x is greater than 0")
x is greater than 0

Control Flow: if-elif-else statements

The if-else statement evaluates the condition and runs the body of the if branch when the condition is met; otherwise, it runs the body of the else branch.

x = 0
if x > 0: # if the condition x > 0 holds
    print("x is greater than 0")
else:     # if the condition x > 0 does not hold
    print("x is negative or 0")
x is negative or 0

Control Flow: if-elif-else statements

x = -10
if x == 0:  # if x equals 0
    print("x is 0")
elif x > 0: # if x is greater than 0
    print("x is greater than 0")
else:       # if neither of the two conditions above holds
    print("x is negative")

Control Flow: if-elif-else statements

A further example:

x = 5
y = 10
z = 15
if x > y:    
    if x > z: 
        print("x is greater than y and z")
    else:
        print("x is greater than y but not greater than z")
elif y > z:  
    print("x is not greater than y, y is greater than z")
else:
    print("z is the largest")

Control Flow: for statements

  • In Python, both for and while loops are used for repeating a group of instructions.
  • The for-loop is used for iterating over a sequence, such as a list or string.
for i in ["Kitten","Cat","Feline"]:
    print(i, end=", ")
for i in [0,1,2,3]: # loop over every number in the list
    print(i, end=", ")

Control Flow: for statements

Examples of for-loop

Next, we compute the running sum 0 + 1 + 2 + 3:

j = 0
for i in [0,1,2,3]:
    j += i 
    print("i = %d, j = %d" % (i,j))
i = 0, j = 0
i = 1, j = 1
i = 2, j = 3
i = 3, j = 6
  • [0,1,2,3] can be replaced with the range(0,4) function.
j = 0
for i in range(0,4):
    j += i 
    print("i = %d, j = %d" % (i,j))
i = 0, j = 0
i = 1, j = 1
i = 2, j = 3
i = 3, j = 6
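
To see which numbers a range produces, it can be converted to a list. A small sketch (the variable names are illustrative); note that the stop value is excluded, in line with the half-open interval rule:

```python
same = list(range(0, 4)) == [0, 1, 2, 3]   # the stop value 4 is excluded
short = list(range(4))                      # the start defaults to 0
evens = list(range(0, 10, 2))               # optional third argument: the step

print(same, short, evens)
```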
  • When using a for-loop, it is common to include an if condition.
  • This allows us to perform operations on individual elements in a list or string based on specific conditions.
cars = ['audi', 'bmw', 'benz', 'toyota']
for x in cars:
    if x == "bmw":
        print(x.upper(), end=", ")
    else:
        print(x.lower(), end=", ")
audi, BMW, benz, toyota, 

The following example demonstrates how to write text to a file on our device iteratively:

import os 
cwd = os.getcwd()
# write to the file
write_file = open(cwd+"/example_iterative_text.txt", "w")
for i in range(1, 4):
    write_file.write("%d x 5 = %d \n" % (i,i*5))
write_file.close()
# read the file contents
read_file = open(cwd+"/example_iterative_text.txt", "r")
text = read_file.read()
read_file.close()
print(text)

Control Flow: while statements

  • A while loop executes a block of statements repeatedly as long as a specific condition is met.
  • The loop stops as soon as the condition becomes false.
x = 5
y = 0
while y < x:
    # check whether y is less than x; if so, execute the lines below
    print(y, end = " ")
    y += 1 # add 1 to y in each iteration and reassign it
print("Exit while-loop")
0 1 2 3 4 Exit while-loop

Control Flow: while statements

Suppose we want to find out how long 10000 yuan must stay in a savings account to reach 20000 yuan through annual interest payments alone, at a rate of 1.75%.

amount = 10000 # initial deposit of 10000 yuan RMB
rate = 0.0175 # annual interest rate of 1.75%
year = 0
while amount < 20000: # repeat until amount reaches 20000
    amount = amount + amount*rate
    year = year+1
print('We need', year , 'years to reach', amount , 'yuan RMB.')
We need 40 years to reach 20015.973431860315 yuan RMB.
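
The loop result can be cross-checked analytically: the amount doubles once (1 + rate)^n >= 2, that is, for n >= log(2) / log(1 + rate). A minimal sketch of this check with the math module (the names years and amount_check are illustrative, and the last digits of the amount may differ slightly from the loop because of floating-point rounding):

```python
import math

rate = 0.0175
# Smallest whole number of years n with (1 + rate)**n >= 2
years = math.ceil(math.log(2) / math.log(1 + rate))
amount_check = 10000 * (1 + rate) ** years

print(years, amount_check)
```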

Control Flow: break and continue statements

  • However, there are situations where
    • you might want to end the current iteration
    • or stop the loop entirely without checking the test expression.
for i in sequence:
    # Code inside the for-loop
    if condition:
        break 
        # Will skip below and move to the code outside the for-loop 
    # Code inside the for-loop
# Code outside the for-loop

Control Flow: break statement

  • break statement can stop a for loop.
for i in "ABCDEF":
    if i == "D":
        break
    print(i)
print("The end")
A
B
C
The end

Control Flow: continue statement

continue statement allows skipping the remaining code within a loop for the current iteration only.

for i in sequence:
    # Code inside the for-loop
    if condition:
        continue 
        # Will skip the code below and move on to the next iteration of the for-loop
    # Code inside the for-loop
# Code outside the for-loop

Control Flow: continue statement

for i in "ABCDEF":
    if i == "D":
        continue
    print(i)
print("The end")
A
B
C
E
F
The end

Functions and Classes: functions

  • A logical block of statements can be grouped into a Python function. We communicate with a function through its interface: we pass arguments (parameters) to it and get the results back.
from math import log
log(1) 
0.0

Functions and Classes: functions

To define a Python function by ourselves, we use the def keyword:

def function_name(arg1, arg2, ....., argn):  
     # statements executed by the function
     return results  

The Python interpreter finds the def keyword and remembers this function_name.

def square_values(x):
    return x*x
square_values(10)
100

Functions and Classes: functions

Examples of Python Functions

  • In the first example, we pass a string variable into the function and return its upper and lower case in order.
def upper_lower(s):
    return s.upper(), s.lower()
sentence = 'python prograMMing fOr ecoNOmeTRics'
upper_lower(sentence)
('PYTHON PROGRAMMING FOR ECONOMETRICS', 'python programming for econometrics')
  • The function returns a tuple. To access its elements by index:
results = upper_lower(sentence)
print("The original sentence is `%s`. \nIts upper case is `%s`. \nIts lower case is `%s`." % (sentence,results[0],results[1]) )
The original sentence is `python prograMMing fOr ecoNOmeTRics`. 
Its upper case is `PYTHON PROGRAMMING FOR ECONOMETRICS`. 
Its lower case is `python programming for econometrics`.
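
Instead of indexing the returned tuple, the two values can be unpacked into separate names. A minimal sketch, reusing upper_lower as defined above; the variable names upper and lower are ours:

```python
def upper_lower(s):
    # Returns a tuple: (upper-case version, lower-case version)
    return s.upper(), s.lower()

sentence = 'python prograMMing fOr ecoNOmeTRics'
upper, lower = upper_lower(sentence)   # unpack the two returned values
print(f"Upper case: {upper}")
print(f"Lower case: {lower}")
```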

We define a function to double the content of an object:

def double_k(k):
    k = k+k
    return k
k_string = "Hello"
double_k(k_string)
'HelloHello'

When the argument is a mutable object such as a list, a function can modify the caller's object in place:

def double_values(k):
    for i in range(len(k)):
        k[i] = k[i]*2
        print(k)
k_input = [0,1,2,10,20]
double_values(k_input)
[0, 1, 2, 10, 20]
[0, 2, 2, 10, 20]
[0, 2, 4, 10, 20]
[0, 2, 4, 20, 20]
[0, 2, 4, 20, 40]

To leave the caller's list unchanged, the function can work on a copy and return it:

import copy
def double_values_local1(k):
    k_local = copy.copy(k)
    for i in range(len(k)):
        k_local[i] = k_local[i]*2
    return k_local
k_input = [0,1,2,10,20]
double_values_local1(k_input), k_input
([0, 2, 4, 20, 40], [0, 1, 2, 10, 20])
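
The two versions above behave differently because a list is mutable and the function receives a reference to it, not a copy. The sketch below uses a silent variant of double_values (the name double_in_place is hypothetical) to show the caller's list changing. It also illustrates that copy.copy is shallow: for nested lists, copy.deepcopy is needed to obtain a fully independent copy.

```python
import copy

def double_in_place(k):
    # Modifies the caller's list, because k refers to the same object
    for i in range(len(k)):
        k[i] = k[i] * 2

k_input = [0, 1, 2, 10, 20]
double_in_place(k_input)
print(k_input)              # the original list has been changed

# copy.copy is shallow: inner lists are still shared with the original
nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)
deep = copy.deepcopy(nested)
nested[0][0] = 99
print(shallow[0][0], deep[0][0])   # shallow copy sees the change, deep copy does not
```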

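A function can also work with a global object directly instead of receiving it as an argument. The global statement declares that a name inside the function refers to the module-level variable; strictly speaking, it is required only when the function assigns a new value to that name.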

def double_values_local2():
    global k_input
    k_local = copy.copy(k_input)
    for i in range(len(k_input)):
        k_local[i] = k_local[i]*2
    return k_local
k_input = [0,1,2,10,20]
double_values_local2(), k_input
([0, 2, 4, 20, 40], [0, 1, 2, 10, 20])
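
As a small sketch of when the global statement actually matters, compare reading a global object with rebinding a global name. The names counter, values, read_only and increment are made up for this illustration.

```python
counter = 0
values = [1, 2, 3]

def read_only():
    # Reading a global object needs no declaration
    return sum(values)

def increment():
    global counter      # required here: the function rebinds the global name
    counter = counter + 1

increment()
increment()
print(read_only(), counter)
```

Without the global line, the assignment inside increment would create a local variable and the call would raise an UnboundLocalError.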

Functions and Classes: setting default values

  • Users can set default values for function arguments (parameters),
    • making them optional for users.
  • The following example defines a function called multiply_m
    • which takes two arguments, $j$ and $m$, and computes $j \times m$.

Functions and Classes: setting default values

  • The argument m has a default value of 3 by setting m=3 in the def statement.
  • Users can choose to provide values of both j and m.
def multiply_m(j, m=3):
    print("%d * %d = %d" % (j,m,j*m))
multiply_m(5)
5 * 3 = 15
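
To see the default being overridden, the sketch below calls the function in three ways. The return statement is an addition for this illustration, so that the products can be stored; the original function only prints.

```python
def multiply_m(j, m=3):
    print("%d * %d = %d" % (j, m, j*m))
    return j * m   # added for this sketch; the original only prints

a = multiply_m(5)         # m falls back to its default value, 3
b = multiply_m(5, 4)      # m supplied by position
c = multiply_m(5, m=10)   # m supplied by keyword
```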

Functions and Classes: class

  • The Python class is a collection of Python functions and introduces some new syntax,
    • three new object types,
    • and some new semantics.
    class ClassName:
        <statement-1>
        ...
        <statement-N>
  • Class definitions, like function definitions (def statements), must be executed before they have any effect.
  • A class may define a special method named __init__():
        def __init__(self):
            self.data = []

Functions and Classes: class

Consider a simple example of a class called First_class:

class First_class:
    def __init__(self, x1, x2):
        # the first parameter of this method is self
        self.r = x1
        self.i = x2

x = First_class(x1=3.0, x2=-4.5)
x, x.r, x.i
(<__main__.First_class at 0x7ff7e4928280>, 3.0, -4.5)

Functions and Classes: class

  • Instance variables are for data unique to each instance,
    • and class variables are for attributes and methods shared by all class instances.

Functions and Classes: class

class Puppy:
    # class variable shared by all instances
    kind = 'Snoopy'
    def __init__(self, name):
        # instance variable unique to each instance
        self.name = name

d = Puppy('Fido')
print(d.kind) # shared by all instances
print(d.name) # unique instance variable
Snoopy
Fido

Functions and Classes: class

In the code below, the method add_trick is invoked by calling d.add_trick('Piu').

class Puppy:
    kind = 'Snoopy'
    def __init__(self, name):
        self.name = name
        self.tricks = []
    def add_trick(self, trick):
        self.tricks.append(trick)
d = Puppy('Fido')
d.add_trick('Piu')
d.tricks
['Piu']
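
Because tricks is created inside __init__, every instance gets its own list, while kind is shared. A minimal sketch with a second instance (the puppy named 'Buddy' and its trick are made up for illustration):

```python
class Puppy:
    kind = 'Snoopy'                 # class variable, shared by all instances
    def __init__(self, name):
        self.name = name            # instance variables, one set per instance
        self.tricks = []
    def add_trick(self, trick):
        self.tricks.append(trick)

d = Puppy('Fido')
e = Puppy('Buddy')                  # a second, hypothetical puppy
d.add_trick('Piu')
e.add_trick('roll over')
print(d.kind == e.kind)             # both read the same class variable
print(d.tricks, e.tricks)           # each instance keeps its own list
```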

Functions and Classes: functional programming tools

  • Python code can also be written in a more functional style.
    • Functional programming means expressing a computation by applying functions to data, rather than by repeatedly modifying objects.
    • e.g., with map, filter and lambda

Functions and Classes: list comprehensions

Consider a for-loop which multiplies each element in a list [0,1,2,3] by 2:

y = [] # define an empty list
for x in range(4):   # loop over x = 0, 1, 2, 3
    z = x*2          # compute the value of z
    y.insert(x,z)    # insert the z for each x into the list y at position x
y
[0, 2, 4, 6]

Functions and Classes: list comprehensions

The same result using a list comprehension:

[x*2 for x in range(4)]
# loop over the values of x and evaluate the expression x*2 for each
[0, 2, 4, 6]

Functions and Classes: list comprehensions

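A list comprehension consists of an expression to evaluate followed by a for clause, optionally with an if clause that filters the elements, and returns a list of results. In the example below, the expression is x*1 and only values with x > 5 are kept.

[x*1 for x in range(10) if x >5]
# loop over the values of x, evaluate the expression, and keep only the values with x > 5
[6, 7, 8, 9]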

Functions and Classes: list comprehensions

Written as a for-loop, the above code is lengthier:

y = []
for x in range(10):
    if x > 5:
        y.append(x*1)
y
[6, 7, 8, 9]

Functions and Classes: anonymous function

  • The def keyword defines a function with an assigned name,
    • and then one can repeatedly call this function.
  • In practice, we sometimes need a function for one-time use only.
    • The anonymous (or lambda) function can help.
  • Let us start by considering a function named f with one parameter ($x$):
def f(x): # define a function f(x); note the trailing colon
    return 2*x
f(1)
2

Functions and Classes: anonymous function

  • The lambda keyword expresses the above function as a single-line expression:
lambda x: 2*x
<function __main__.<lambda>(x)>

To call this anonymous function:

(lambda x: 2*x)(1)
2

Functions and Classes: anonymous function

  • A lambda function can also take multiple parameters.
  • Consider

$$f(x, y, z) = 7x + x^5 + 3y + \frac{6}{z}$$

where $x = 1$, $y = 2$ and $z = 3$.

def f(x, y, z):
    return(7*x + x**5 + 3*y + 6/z)
f(1,2,3)
16.0

Functions and Classes: anonymous function

The equivalent lambda expression replaces the def-based definition with a single line:

(lambda x,y,z: 7*x + x**5 + 3*y + 6/z)(1,2,3)
16.0
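
A typical one-time use of lambda is as the key argument of sorted. The sketch below uses made-up (label, value) pairs purely for illustration.

```python
# Made-up (label, value) pairs, for illustration only
pairs = [("a", 3), ("b", 1), ("c", 2)]

# The lambda is used once, as the sorting key, and is never given a name
by_value = sorted(pairs, key=lambda p: p[1])
print(by_value)
```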

Functions and Classes: map a function

  • The map(fun, seq) function applies a function fun to each element of a sequence seq. In Python 3 it returns a map object (an iterator), which list() converts into a list with the same length as seq.
  • Consider a simple function $f(x) = 2x + x^2$:

def f(x):
    return(2*x + x**2)

Functions and Classes: map a function

Now we would like to produce a list that contains the results of $f(x)$ for $x = 0, 1, \dots, 10$. Using the for-loop coding style, this can be done by:

h = []
for x in range(0,11):
    h.append(f(x))
h
[0, 3, 8, 15, 24, 35, 48, 63, 80, 99, 120]

Functions and Classes: map a function

Employing the map() function, a single line of Python does the same job:

list(map(f, range(0,11)))
[0, 3, 8, 15, 24, 35, 48, 63, 80, 99, 120]

map is often combined with an anonymous lambda function, so that the function does not need to be defined beforehand with def.

list(map(lambda x: 2*x + x**2, range(0,11)))
[0, 3, 8, 15, 24, 35, 48, 63, 80, 99, 120]
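
The list() wrapper in these examples is needed because, in Python 3, map returns a lazy iterator rather than a list. A small sketch of what that means in practice (the variable names are ours):

```python
m = map(lambda x: 2*x + x**2, range(0, 11))
print(type(m).__name__)     # a map object, not a list

first = next(m)             # values are computed on demand
rest = list(m)              # consumes the remaining ten values
again = list(m)             # the iterator is now exhausted
print(first, len(rest), again)
```

If the results are needed more than once, store them in a list right away.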

Functions and Classes: filter

Filter Elements in a Sequence

  • The filter(fun, seq) function takes the same kind of arguments as map.
  • The function fun passed to filter should return either True or False (Python's built-in Boolean constants) for each element.
def positive_checker(x):
    if x > 0:
        return(True)
    else:
        return(False)
positive_checker(1), positive_checker(-1)
(True, False)

Functions and Classes: filter

  • filter keeps only those elements of the sequence for which the provided function returns True. In Python 3 it returns an iterator, so we wrap it in list() to obtain a list.
list(filter(positive_checker, range(-10,10,1)))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
  • Combined with a lambda function, there is no need to define the function with def beforehand:
list(filter(lambda x: x > 0, range(-10,10,1)))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
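
map and filter can be chained, and the result is equivalent to a list comprehension with an if clause. A sketch reusing the function f and the positivity test from the earlier slides:

```python
def f(x):
    return 2*x + x**2

# filter first, then map, in functional style
functional = list(map(f, filter(lambda x: x > 0, range(-10, 10))))

# the same computation as a single list comprehension
comprehension = [2*x + x**2 for x in range(-10, 10) if x > 0]

print(functional == comprehension)
```

Which form to prefer is largely a matter of readability; both produce the same list.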

Use Python and Stata Together

  • Stata 17 or later offers full Python-Stata integration.
  • Since Python and Stata have unique benefits in econometric analysis, researchers can leverage Python to code data analysis routines that may not be present in Stata and then utilize Stata for other tasks.
  • We highlight a standard method for using Python and Stata in conjunction.

Use Python and Stata Together: configurations

  • The pystata Python package allows us to call Stata 17 (or a later version) from Python.
  • To enable Python to find the Stata software installed on our machine, another Python package called stata_setup can help.
pip install pystata
pip install stata_setup

Suppose Stata is installed in the '/Applications/Stata/' directory and the edition is Stata/MP:

import os
os.chdir('/Applications/Stata/utilities')
from pystata import config
config.init('mp')
  ___  ____  ____  ____  ____ ©
 /__    /   ____/   /   ____/      17.0
___/   /   /___/   /   /___/       MP—Parallel Edition

 Statistics and Data Science       Copyright 1985-2021 StataCorp LLC
                                   StataCorp
                                   4905 Lakeway Drive
                                   College Station, Texas 77845 USA
                                   800-STATA-PC        https://www.stata.com
                                   979-696-4600        stata@stata.com
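
The stata_setup package mentioned earlier offers an alternative to changing the working directory. The sketch below wraps it in a helper (the name init_stata is ours). It assumes the same installation path and edition as above, and the helper is only defined here, not called, since calling it requires a local Stata installation.

```python
def init_stata(stata_dir="/Applications/Stata/", edition="mp"):
    """Initialise pystata through stata_setup and return the stata module."""
    import stata_setup                      # installed via pip install stata_setup
    stata_setup.config(stata_dir, edition)  # locate Stata and initialise it
    from pystata import stata
    return stata

# Usage, on a machine with Stata 17 or later installed:
# stata = init_stata()
# stata.run("sysuse auto, clear")
```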

Use Python and Stata Together: call Stata from Python

The stata.run() function under the pystata module can be used to execute Stata commands.

from pystata import stata
# run a block of Stata code
stata.run(
    '''clear all
    sysuse auto
    sum price''')

Use Python and Stata Together: call Stata from Python

See what we can obtain:

. clear all
. sysuse auto
(1978 automobile data)
. sum price

Variable |   Obs        Mean    Std. dev.    Min     Max
---------+----------------------------------------------
   price |    74    6165.257    2949.496    3291   15906

Use Python and Stata Together: call Stata from Python

  • Many Stata commands save their results in r() (r-class commands) and e() (e-class commands).
  • In Python, we use stata.get_return() and stata.get_ereturn() to retrieve them as dictionaries.
stata.get_return()
{'r(N)': 74.0,
 'r(sum_w)': 74.0,
 'r(mean)': 6165.256756756757,
 'r(Var)': 8699525.974268788,
 'r(sd)': 2949.495884768919,
 'r(min)': 3291.0,
 'r(max)': 15906.0,
 'r(sum)': 456229.0} 
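
Once the results are in a Python dictionary, they can be used in further calculations. The sketch below does not call Stata: it copies four entries from the output shown above into a literal dictionary (named r here for convenience) and derives two statistics from them.

```python
import math

# Values copied from the stata.get_return() output shown above
r = {'r(N)': 74.0, 'r(mean)': 6165.256756756757,
     'r(Var)': 8699525.974268788, 'r(sd)': 2949.495884768919}

cv = r['r(sd)'] / r['r(mean)']                 # coefficient of variation
se_mean = r['r(sd)'] / math.sqrt(r['r(N)'])    # standard error of the mean
print(f"CV = {cv:.3f}, SE(mean) = {se_mean:.1f}")
```

In a live session, r would instead be assigned from stata.get_return() right after the sum price command.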