焦点速递!Pandas时间类型转换与处理的实现示例
来源:脚本之家    时间:2022-07-29 16:06:49
目录
案例1案例2案例3案例4补充知识案例5案例6案例7案例8案例9案例10

在平时的需求开发中,经常涉及到利用Pandas处理日期相关类型字段的转换和操作,为此特地记录以下练习案例,帮助大家的同时,也便于日后的学习和复盘

案例1

问题:提取"W1|2022/7/28"字段中的年月日信息,取名为week_start,即一周开始的日期,并根据week_start计算出该周结束的具体日期week_end


(资料图)

import pandas as pd
import datetime
df1 = pd.DataFrame([[6,3],[6,3]], columns = ["Working day","W1|2022/7/28"])
# 一周开始的日期
# "2022/7/28"——>str类型
week_start = df1.columns[1].split("|")[1]
# 将start_day类型转换成date类型(2022-07-28)
week_start = datetime.datetime.strptime(week_start, "%Y/%m/%d").date()
# 一周结束的日期(2022-08-03)
week_end = week_start + datetime.timedelta(days=6)

df1

案例2

问题:根据"Date"字段生成"Date - 2"字段

import pandas as pd
from datetime import timedelta
from datetime import datetime
 
df2 = pd.DataFrame([[1,"20191031"],
                   [2,"20191106"],
                   [3,"20191106"]],columns=["Id","Date"])
# "Date"字段中的值减去2天,生成"Date - 2"字段
df2["Date - 2"] = df2["Date"].apply(lambda x:(datetime.strptime(x,"%Y%m%d") - timedelta(days=datetime.strptime(x,"%Y%m%d").weekday())).strftime("%Y%m%d"))

df2

案例3

问题:从字符串表示的日期时间中仅获取“年/月/日”

import pandas as pd
from datetime import datetime
 
df3 = pd.DataFrame([[1,"2017-01-02 00:00:00"],
                   [2,"2017-01-09 00:00:00"]
                   ],columns = ["Id","Wk"])

df3

错误写法

# 运行以下代码会报错"str" object has no attribute "strftime"
df3["new_wk"] = df3["Wk"].apply(lambda x:x.strftime("%Y%m%d"))

正确写法

# 先利用.strptime()将str格式的变量转化成datetime下的时间格式
# 然后再利用.strftime()获取“年/月/日”
df3["Wk"] = df3["Wk"].apply(lambda x:datetime.strptime(x,"%Y-%m-%d %H:%M:%S"))
df3["new_Wk"] = df3["Wk"].apply(lambda x:x.strftime("%Y/%m/%d"))

处理过后的df3

案例4

问题:将"月/日/年 时间"格式的值转换为"年月日"(10/11/19 05:28:27 => 20191011)

import pandas as pd
 
df4 = pd.DataFrame([["A","10/11/19 05:28:27","08/04/20 08:38:59"],
                   ["B","10/11/19 05:28:27",None],
                   ["C","10/11/19 05:28:27",None]
                  ],columns = ["site","creation_date","closure_date"])

df4

# 将"creation_date"栏位的值变形
# 10/11/19 05:28:27 => 20191011
df4["creation_date"] = df4["creation_date"].apply(lambda x:pd.to_datetime(x).strftime("%Y%m%d"))
 
# 将"closure_date"字段中nan值填充为0
df4["closure_date"] = df4["closure_date"].fillna(0)
# 筛选closure_date"字段中值为0的数据记录,取名为df4_na
df4_na = df4[df4["closure_date"].isin([0])]
# 筛选closure_date"字段中值不为0的数据记录,取名为df4
df4 = df4[~df4["closure_date"].isin([0])]
 
# 将"closure_date"栏位的值变形
# 08/04/20 08:38:59 => 20200804
df4["closure_date"] = df4["closure_date"].apply(lambda x:pd.to_datetime(x).strftime("%Y%m%d"))
 
df4 = pd.concat([df4, df4_na], ignore_index = True)

处理过后的df4

补充知识

我们通常使用pd.to_datetime()和s.astype("datetime64[ns]")来做时间类型转换

import pandas as pd
 
t = pd.Series(["20220720","20220724"])
# dtype: datetime64[ns]
new_t1 = pd.to_datetime(t)
new_t2 = t.astype("datetime64[ns]")

t

new_t1

new_t2

案例5

问题:添加字段"Week",逐行递增

import pandas as pd
 
df5 = pd.DataFrame(columns=["Week","Materials"])
all_material = ["A32456","B78495"]
 
for row in range(0,3):
    week = row + 1
    datas = [week, all_material]
    df5.loc[row] = datas
"""
df5:
 
  Week         Materials
0    1  [A32456, B78495]
1    2  [A32456, B78495]
2    3  [A32456, B78495]
"""
print(df5)

案例6

问题:日期型转换为字符型

import datetime
today = datetime.date.today() # date类型 2022-07-28
today.strftime("%Y-%m-%d") # "2022-07-28"
import datetime
dt = datetime.datetime.now() # datetime类型 2022-07-28 22:46:20.528813
dt.strftime("%Y-%m-%d") # "2022-07-28"
import datetime
today = str(datetime.date.today()) # str类型 2022-07-28
today.replace("-","") # "20220728"

案例7

问题:文本型转日期型

#文本型日期转为日期型日期
import pandas as pd
from datetime import datetime
df7=pd.DataFrame({"销售日期":["2022-05-01","2022-05-02","2022-05-03","2022-05-04","2022-05-05","2022-05-06","2022-05-07","2022-05-08","2022-05-09","2022-05-10"],
                "城市":["兰州","白银","天水","武威","金昌","陇南","嘉峪关","酒泉","敦煌","甘南"]})

df7

文本型转为日期型可用datetime.strptime函数

# "%Y-%m-%d"表示将文本日期解析为年月日的日期格式
df7["日期"] = df7["销售日期"].map(lambda x:datetime.strptime(x,"%Y-%m-%d"))

文本型转为日期型也可用pd.to_datetime函数

# "%Y-%m-%d"表示将文本日期解析为年月日的日期格式
df7["日期"] = pd.to_datetime(df7["销售日期"],format="%Y-%m-%d")

处理过后的df7

案例8

问题:提取日期字段的年份、月份、日份和周数

import pandas as pd
from datetime import datetime
df8=pd.DataFrame({"销售日期":["2022-05-01","2022-05-02","2022-05-03","2022-05-04","2022-05-05","2022-05-06","2022-05-07","2022-05-08","2022-05-09","2022-05-10"],
                "城市":["兰州","白银","天水","武威","金昌","陇南","嘉峪关","酒泉","敦煌","甘南"]})
 
df8["日期"] = df8["销售日期"].map(lambda x:datetime.strptime(x,"%Y-%m-%d"))

df8

#由日期数据提取年
df8["年份"] = df8["日期"].apply(lambda x: x.year)
df8["年份"] =df8["年份"].astype(str)+"年"
 
#由日期数据提取月
df8["月份"] = df8["日期"].apply(lambda x: x.month)
df8["月份"] =df8["月份"].astype(str)+"月"
 
#由日期数据提取日
df8["日份"] = df8["日期"].apply(lambda x: x.day)
df8["日份"] =df8["日份"].astype(str)+"日"
 
# 日期中的周使用date.isocalendar()[1]提取
#根据日期返回周数,以周一为第一天开始
df8["周数"] = [date.isocalendar()[1] for date in df8["日期"].tolist()]
df8["周数"] = df8["周数"].astype(str)+"周"

处理后的df8

案例9

问题:借助offset时间偏移函数将日期加3天

import pandas as pd
from datetime import datetime
df9=pd.DataFrame({"销售日期":["2022-05-01","2022-05-02","2022-05-03","2022-05-04","2022-05-05","2022-05-06","2022-05-07","2022-05-08","2022-05-09","2022-05-10"],
                "城市":["兰州","白银","天水","武威","金昌","陇南","嘉峪关","酒泉","敦煌","甘南"]})
 
df9["日期"] = df9["销售日期"].map(lambda x:datetime.strptime(x,"%Y-%m-%d"))

df9

#借助offset时间偏移函数将日期加3天
from pandas.tseries.offsets import Day
df9["日期_3"]=df9["日期"]+Day(3)

处理后的df9

案例10

问题:将文本型日期转换为日期型日期

#文本型日期转为日期型日期
import pandas as pd
import datetime as dt
from datetime import datetime
df1=pd.DataFrame({"销售时间":["2022-05-01 00:00:00","2022-05-02 00:00:00","2022-05-03 00:00:00","2022-05-04 00:00:00","2022-05-05 00:00:00",
                         "2022-05-06 00:00:00","2022-05-07 00:00:00","2022-05-08 00:00:00","2022-05-09 00:00:00","2022-05-10 00:00:00",]})
#df["日期"]=df["销售日期"].map(lambda x:datetime.strptime(x,"%Y-%m-%d"))
df1["日期_x"]=df1["销售时间"].str.split(" ",expand=True)[0]
df1["日期_y"]=pd.to_datetime(df1["销售时间"],format="%Y-%m-%d")
df1

df10

日期中带有时分秒"00:00:00",有如下方法将其处理为"%Y-%m-%d"形式

df10["日期"]=df10["销售时间"].str.split(" ",expand=True)[0]

处理后的df10

到此这篇关于Pandas时间类型转换与处理的实现示例的文章就介绍到这了,更多相关Pandas时间类型转换内容请搜索脚本之家以前的文章或继续浏览下面的相关文章希望大家以后多多支持脚本之家!

关键词: 销售日期 时间类型 数据记录 日期格式

上一篇:

下一篇:

X 关闭

X 关闭