python - Why does my plot have criss-crossing lines when I convert the index from string to datetime? - Stack Overflow
I am trying to plot a time series with plt.plot(), but I keep getting strange output. A sample of the dataset is provided below (the whole dataset has roughly 150,000 entries). One of the columns holds time values and serves as the index. Depending on whether I convert the index from strings to datetime objects, I get two different plots of the target variable.
Scenario 1: string index
import matplotlib.pyplot as plt
import pandas as pd
sample_df = pd.DataFrame()
sample_df["Date"] = [ "2002-12-31 01:00:00", "2002-06-06 10:00:00", "2003-11-10 19:00:00",
"2003-04-15 04:00:00", "2004-09-19 14:00:00", "2004-02-24 23:00:00",
"2005-07-30 08:00:00", "2005-01-03 17:00:00", "2006-06-08 02:00:00",
"2007-11-12 11:00:00", "2007-04-18 20:00:00", "2008-09-21 06:00:00",
"2008-02-26 15:00:00", "2009-08-03 00:00:00", "2009-01-05 09:00:00",
"2010-06-11 19:00:00", "2011-11-14 04:00:00", "2011-04-20 13:00:00",
"2012-09-24 23:00:00", "2012-02-28 08:00:00", "2013-08-04 17:00:00",
"2013-01-07 02:00:00", "2014-06-13 09:00:00", "2015-11-17 18:00:00",
"2015-04-22 01:00:00", "2016-09-26 09:00:00", "2016-03-02 18:00:00",
"2017-08-06 01:00:00", "2017-01-10 10:00:00", "2018-01-16 19:00:00" ]
sample_df["Energy_MW"] = [ 26498.0, 39167.0, 36614.0, 21837.0, 26644.0,
33574.0, 30255.0, 33781.0, 24344.0, 34708.0,
33996.0, 21127.0, 36255.0, 31982.0, 35448.0,
37066.0, 22116.0, 31326.0, 26569.0, 33565.0,
34649.0, 25709.0, 33516.0, 33032.0, 22333.0,
28064.0, 33905.0, 25304.0, 41505.0, 39543.0 ]
sample_df = sample_df.set_index("Date")
# Basic plot.
fig = plt.figure( figsize = (10,5) )
plt.plot(sample_df.index, sample_df["Energy_MW"], 'b')
plt.grid(True)
plt.show()
This is the plot corresponding to a string index:
Scenario 2: datetime index
import matplotlib.pyplot as plt
import pandas as pd
sample_df = pd.DataFrame()
sample_df["Date"] = [ "2002-12-31 01:00:00", "2002-06-06 10:00:00", "2003-11-10 19:00:00",
"2003-04-15 04:00:00", "2004-09-19 14:00:00", "2004-02-24 23:00:00",
"2005-07-30 08:00:00", "2005-01-03 17:00:00", "2006-06-08 02:00:00",
"2007-11-12 11:00:00", "2007-04-18 20:00:00", "2008-09-21 06:00:00",
"2008-02-26 15:00:00", "2009-08-03 00:00:00", "2009-01-05 09:00:00",
"2010-06-11 19:00:00", "2011-11-14 04:00:00", "2011-04-20 13:00:00",
"2012-09-24 23:00:00", "2012-02-28 08:00:00", "2013-08-04 17:00:00",
"2013-01-07 02:00:00", "2014-06-13 09:00:00", "2015-11-17 18:00:00",
"2015-04-22 01:00:00", "2016-09-26 09:00:00", "2016-03-02 18:00:00",
"2017-08-06 01:00:00", "2017-01-10 10:00:00", "2018-01-16 19:00:00" ]
sample_df["Energy_MW"] = [ 26498.0, 39167.0, 36614.0, 21837.0, 26644.0,
33574.0, 30255.0, 33781.0, 24344.0, 34708.0,
33996.0, 21127.0, 36255.0, 31982.0, 35448.0,
37066.0, 22116.0, 31326.0, 26569.0, 33565.0,
34649.0, 25709.0, 33516.0, 33032.0, 22333.0,
28064.0, 33905.0, 25304.0, 41505.0, 39543.0 ]
sample_df = sample_df.set_index("Date")
sample_df.index = pd.to_datetime(sample_df.index)
# Basic plot.
fig = plt.figure( figsize = (10,5) )
plt.plot(sample_df.index, sample_df["Energy_MW"], 'b')
plt.grid(True)
plt.show()
And this is the plot corresponding to a datetime index:
Why does converting the index to datetime make the plot so much worse, and how do I resolve it? Please explain in simple words. I plotted both scenarios in order to locate the source of the problem. I would like to know how to get a correct plot when the index also has the correct data type (datetime). I also came across a YouTube video covering the same dataset and the same time series; the plot there looked fine even though the index of the dataframe had been converted to datetime.
Asked 15 hours ago by Lyudmil Yovkov
1 Answer
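You need to sort the index. plt.plot() connects the points in the order the rows appear in the dataframe. With a string index, matplotlib treats each label as a category and places the labels at equally spaced positions in order of first appearance, so the line always moves left to right even though the dates are out of order. With a datetime index, each point is placed at its true position on the time axis, so unsorted rows make the line jump back and forth, which produces the criss-crossing.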
sample_df = sample_df.set_index("Date").sort_index() # sort the rows by date
# The rest is the same as your code.
sample_df.index = pd.to_datetime(sample_df.index)
# Basic plot.
fig = plt.figure( figsize = (10,5) )
plt.plot(sample_df.index, sample_df["Energy_MW"], 'b')
plt.grid(True)
plt.show()
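As a quick check of the explanation above, here is a minimal sketch that uses only the first four rows of the question's sample data. The names `df`, `unsorted_df` and `sorted_df` are illustrative and not from the original code. It converts the column to datetime first, sorts afterwards, and uses the index's `is_monotonic_increasing` property to confirm whether the rows are in chronological order.

```python
import pandas as pd

# A small, deliberately unsorted subset of the question's sample data.
df = pd.DataFrame(
    {
        "Date": ["2002-12-31 01:00:00", "2002-06-06 10:00:00",
                 "2003-11-10 19:00:00", "2003-04-15 04:00:00"],
        "Energy_MW": [26498.0, 39167.0, 36614.0, 21837.0],
    }
)

# Convert to datetime first, then sort on the real datetime values.
df["Date"] = pd.to_datetime(df["Date"])
unsorted_df = df.set_index("Date")
sorted_df = unsorted_df.sort_index()

# Rows that are not in chronological order make the plotted line double back.
print(unsorted_df.index.is_monotonic_increasing)  # False
print(sorted_df.index.is_monotonic_increasing)    # True
```

Passing `sorted_df.index` and `sorted_df["Energy_MW"]` to plt.plot() should then give a line that only moves forward in time. Sorting the strings before converting, as in the answer, also works here because timestamps in this year-month-day format sort the same way as text and as dates.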