Unveiling the Dynamics: A Python Analysis of the Dhaka Stock Market


Welcome to the world of Dhaka stock market analysis! In this post, we'll look at trends, data analysis, and insights specific to the Dhaka stock market. Using Python in a Jupyter notebook, we'll work through the data step by step to better understand what drives this market. Join me as we explore what the analysis means for investors and enthusiasts alike.

Dataset


Here is the stock market dataset for your analysis:

Dhaka Stock Market Dataset

Feel free to explore this comprehensive dataset to gain valuable insights into the dynamics of the Dhaka stock market.

Understanding the Analysis Code


Let's walk through the code block by block.

  1. Import the necessary libraries


    These libraries are fundamental to the analysis:

    • numpy: Used for numerical operations and array manipulations.

    • pandas: Essential for data manipulation and analysis, providing powerful data structures and tools.

    • matplotlib.pyplot: Enables the creation of visualizations such as plots, charts, and graphs.

    • seaborn: A powerful library for statistical data visualization, enhancing the aesthetics and overall appeal of visualizations.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
  1. Loading the dataset with pandas


    In this section, the stock market data is loaded into the Jupyter notebook with the read_csv() function from pandas. The head() function then displays the first rows of the dataset, giving a glimpse of the structure and contents of the Dhaka stock market data.

# Load the data into stock_data, the name every later step uses
stock_data = pd.read_csv("../input/stock-market-data/Stock_Market_Data.csv")
stock_data.head()

  1. Convert the 'Date' column in the DataFrame to a DateTime format


    This line of code converts the 'Date' column in the stock_data DataFrame to a DateTime format, with the 'dayfirst' parameter set to True. This conversion is essential for time-series analysis and ensures that the date data is interpreted accurately within the analysis.

stock_data['Date'] = pd.to_datetime(stock_data['Date'],dayfirst=True)
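
    To see why dayfirst matters, here is a minimal sketch using a made-up date string (not taken from the dataset). The same text is read as 3 January or 1 March depending on the setting, which would silently scramble a time series.

```python
import pandas as pd

# Hypothetical sample string; the real dataset's date format is assumed to be day-first.
raw = "03-01-2022"

day_first = pd.to_datetime(raw, dayfirst=True)  # read as day-month-year
month_first = pd.to_datetime(raw)               # default: month-day-year

print(day_first.date(), month_first.date())
```

    If the file uses one known format throughout, passing an explicit format string to pd.to_datetime is another option worth considering.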
  1. Calculate basic summary statistics for each column (mean, median, standard deviation, etc.)


    The code computes summary statistics for the stock_data DataFrame with the describe() function, which reports the count, mean, standard deviation, minimum, quartiles (the 50% row is the median), and maximum for each numerical column in the dataset.

summ_stat = stock_data.describe()
summ_stat
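
    The heading mentions the median, which describe() reports as the 50% row. The sketch below uses a small made-up DataFrame (toy is a stand-in for stock_data) to show that the two agree.

```python
import pandas as pd

# Hypothetical toy data standing in for stock_data.
toy = pd.DataFrame({
    "Close": [10.0, 12.0, 11.0, 50.0],
    "Volume": [100, 200, 150, 120],
})

summary = toy.describe()
medians = toy.median(numeric_only=True)

# The "50%" row of describe() and median() report the same values.
print(summary.loc["50%"])
print(medians)
```

    Note how the single large Close value pulls the mean well above the median, which is one reason to look at both.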

  1. Selecting Key Companies from the Stock Market Dataset


    With so many companies in the dataset, I'll handpick a selection to dive into. This way, we can really get to know these companies and understand their impact on the market as a whole. You can find all the company names with this code.

     stock_data['Name'].unique()
    

    I have selected five companies below.

     ## hand-picked selection of five names from the list above
     unique_names = ['01.Bank', '02.Cement', '03.Ceramics_Sector', '04.Engineering',
            '05.Financial_Institutions']
    
  2. Explore the distribution of the 'Close' prices over time


    This code iterates through the selected company names, filters the data for each one, and plots a histogram of its closing prices. It produces five figures in total, two of which are shown below. The resulting visualizations show how the close prices of each selected company are distributed, which makes it possible to compare their price ranges.

for name in unique_names:
    company_data = stock_data[stock_data['Name'] == name]
    plt.figure(figsize=(15,5))
    sns.histplot(data=company_data ,x="Close",bins=30,label=name)
    plt.xlabel("Closing Price")
    plt.ylabel("Frequency")
    plt.title("Distribution of Close Prices Over Time of {}".format(name))
    plt.legend()
    plt.xticks(rotation=45)
    plt.show()

  1. Identify and analyze any outliers (if any) in the dataset


    This code iterates through unique company names in the dataset, creates individual data for each company, and then generates box plots to visualize the distribution of numerical variables, showcasing any potential outliers for each company. These visualizations provide valuable insights into the spread and potential anomalies within the data for the selected companies.

for name in unique_names:
    company_data = stock_data[stock_data['Name']==name]
    plt.figure(figsize=(15,5))
    sns.boxplot(data=company_data.select_dtypes(include=np.number))
    plt.title(f'Box Plots for {name} - Outliers')
    # No plt.legend() here: the box plot has no labelled artists to list
    plt.xticks(rotation=45)
    plt.show()

Output (2 out of 5)
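
    Box plots show outliers visually, but it can also help to list them. The sketch below applies the 1.5 × IQR rule, which is seaborn's default whisker length, to a made-up price series. The helper name iqr_outliers and the sample values are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries of `values` outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    mask = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return values[mask]

# Hypothetical closing prices with one obvious spike.
closes = pd.Series([10.0, 11.0, 12.0, 11.5, 10.5, 40.0])
outliers = iqr_outliers(closes)
print(outliers)
```

    In the notebook, the same helper could be applied to company_data['Close'] or any other numeric column inside the loop.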

  1. Create a line chart to visualize the 'Close' prices over time


    This code iterates through unique company names in the dataset, creates individual data for each company, and then generates line plots to visualize the trend of closing prices over time for each company. These visualizations offer a clear representation of how the closing prices have evolved for the selected companies, providing insights into their historical performance.

for name in unique_names:
    company_data = stock_data[stock_data['Name'] == name]
    plt.figure(figsize=(15, 4))
    plt.plot(company_data['Date'],company_data['Close'],label ="{}".format(name))
    plt.xlabel('Date')
    plt.ylabel('Closing Price')
    plt.title('Close Prices Over Time of {}'.format(name))
    plt.legend()
    plt.xticks(rotation=45)
    plt.show()

Output (2 out of 5)

  1. Calculate and plot the daily percentage change in closing prices


    This code iterates through unique company names in the dataset, calculates the daily percentage change in closing prices for each company, and then generates line plots to visualize these changes over time. These visualizations offer insights into the daily fluctuations in closing prices for the selected companies, aiding in the assessment of their market volatility and performance.

for name in unique_names:
    plt.figure(figsize=(15, 4))
    # .copy() avoids pandas' SettingWithCopyWarning when adding a column to a filtered slice
    company_data = stock_data[stock_data['Name'] == name].copy()
    company_data['Daily_PCT_Change'] = company_data['Close'].pct_change() * 100
    plt.plot(company_data['Date'], company_data['Daily_PCT_Change'], label=name)
    plt.xlabel('Date')
    plt.ylabel('Daily Percentage Change')
    plt.title('Daily Percentage Change in Closing Prices of {}'.format(name))
    plt.legend()
    plt.xticks(rotation=45)
    plt.show()

Output (2 out of 5)
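
    As a quick worked example of what pct_change() computes, here is a sketch on three made-up closing prices. Each value is compared with the one before it, so the first entry has nothing to compare against and comes out as NaN.

```python
import pandas as pd

# Hypothetical closing prices for three consecutive trading days.
closes = pd.Series([100.0, 110.0, 99.0])

# (today - yesterday) / yesterday, expressed in percent.
daily_pct = closes.pct_change() * 100
print(daily_pct.round(2).tolist())
```

    A rise from 100 to 110 is +10%, and the fall from 110 to 99 is -10%, even though the price does not return to its starting level.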


  1. Plot the closing prices with a rolling-average trend line


    This code iterates through the selected company names, plots the closing prices over time for each company, and overlays a rolling average (e.g., 30 days) to visualize the trend. These visualizations show both the daily fluctuations and the longer-term trend in closing prices for the selected companies, aiding in the assessment of their stock performance.

for name in unique_names:
    company_data = stock_data[stock_data['Name'] == name]
    plt.plot(company_data['Date'],company_data['Close'], label=name)
    # Plotting a rolling average (e.g., 30 days) for trend visualization
    rolling_avg = company_data['Close'].rolling(window=30).mean()
    plt.plot(company_data['Date'],rolling_avg, label=f'{name} - Trend Line', linestyle='--')
    plt.title('Stock Prices Trend Line Over Time')
    plt.xlabel('Date')
    plt.ylabel('Closing Price')
    plt.legend()
    plt.show()

Output (2 out of 5)

  1. Apply moving averages to smooth the time series data in 15/30 day intervals against the original graph


    This code iterates through unique company names in the dataset, calculates the 15-day and 30-day moving averages for each company's closing prices, and then generates visualizations to compare the original closing prices with their respective moving averages over time. These visualizations offer insights into the trend and direction of the stock prices for the selected companies, aiding in the analysis of their historical performance.

for name in unique_names:
    plt.figure(figsize=(15, 5))
    # .copy() avoids pandas' SettingWithCopyWarning when adding the moving-average columns
    company_data = stock_data[stock_data['Name'] == name].copy()
    company_data['15_Day_MA'] = company_data['Close'].rolling(window=15).mean()
    company_data['30_Day_MA'] = company_data['Close'].rolling(window=30).mean()

    # Plotting for the current company
    plt.plot(company_data['Date'], company_data['Close'], label=f'Original Close - {name}',color="red")
    plt.plot(company_data['Date'], company_data['15_Day_MA'], label=f'15-Day MA - {name}', linestyle='--',color="blue")
    plt.plot(company_data['Date'], company_data['30_Day_MA'], label=f'30-Day MA - {name}', linestyle='--',color="green")
    plt.xlabel('Date')
    plt.ylabel('Close Price')
    plt.title('Original Time Series with Moving Averages {}'.format(name))
    plt.legend()
    plt.show()

Output (2 out of 5)
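
    The moving-average lines start later than the price line because a rolling window yields NaN until it is full. The sketch below shows this on five made-up values with a window of 3. The min_periods variant is an optional alternative, not something the original notebook uses.

```python
import pandas as pd

# Hypothetical closing prices.
closes = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Default behaviour: the first window - 1 entries are NaN.
ma3 = closes.rolling(window=3).mean()

# With min_periods=1 the average uses whatever data is available so far.
ma3_partial = closes.rolling(window=3, min_periods=1).mean()

print(ma3.tolist())
print(ma3_partial.tolist())
```

    With window=30, as in the code above, the first 29 rows of each company's moving average are therefore empty.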

  1. Calculate the average closing price for each stock


    The code creates a new DataFrame, 'df', by calculating the average closing price for each company from the 'stock_data' DataFrame using the 'groupby()' function. The resulting DataFrame contains two columns: 'Name' and 'AvgClosingPrice', displaying the average closing price for each company.

df = pd.DataFrame(stock_data.groupby('Name')['Close'].mean()).reset_index()
df.columns=['Name','AvgClosingPrice']
df.head()

  1. Identify the top 5 stocks based on the average closing price


df_sorted = df.sort_values(by='AvgClosingPrice',ascending=False)
top5 = df_sorted.head(5)
top5
  1. Identify the bottom 5 stocks based on the average closing price


df_sorted = df.sort_values(by='AvgClosingPrice',ascending=False)
bottom5 = df_sorted.tail(5)
bottom5
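
    Sorting and then taking head() or tail() works well. As an alternative sketch, pandas also offers nlargest() and nsmallest(), which express the same idea in one call. The avg DataFrame below is made-up data standing in for df.

```python
import pandas as pd

# Hypothetical averages standing in for the df built in the previous step.
avg = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "AvgClosingPrice": [5.0, 50.0, 20.0, 1.0, 35.0, 12.0],
})

top2 = avg.nlargest(2, "AvgClosingPrice")      # highest first
bottom2 = avg.nsmallest(2, "AvgClosingPrice")  # lowest first

print(top2)
print(bottom2)
```

    One difference to keep in mind: nsmallest() lists the lowest value first, whereas tail() on a descending sort lists it last.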
  1. Calculate and plot the rolling standard deviation of the 'Close' prices


    The code iterates through unique company names, retrieves stock data for each company, and creates a plot showing the rolling standard deviation of their closing prices over time using a 30-day window. This visualization helps compare the volatility of stock prices among different companies.

for name in unique_names:
    company_data = stock_data[stock_data['Name'] == name]
    plt.figure(figsize=(15, 8))
    rolling_std = company_data['Close'].rolling(window=30).std()
    plt.plot(company_data['Date'],rolling_std, label=f'Rolling Std (30 days) - {name}',linestyle='--',color="blue")
    plt.xlabel('Date')
    plt.ylabel('Rolling Standard Deviation')
    plt.title('Rolling Standard Deviation of Close Prices for Each Company')
    plt.legend()
    plt.xticks(rotation=45)
    plt.show()

Output (2 out of 5)

  1. Daily price change (Close - Open)


stock_data['daily_price_change'] = stock_data['Close'] - stock_data['Open']
stock_data.head()

  1. Analyze the distribution of daily price changes


    This code iterates through the selected company names, filters the stock data for each company, and uses the Seaborn library to draw a histogram of its daily price changes. Each company gets its own figure. The visualization provides an overview of how often daily price changes of different sizes occur for each company.

for name in unique_names:
    company_data = stock_data[stock_data['Name'] == name]
    sns.histplot(data=company_data, x='daily_price_change', bins=30, label=name)
    plt.xlabel('Daily Price Change')
    plt.ylabel('Frequency')
    plt.title('Distribution of Daily Price Change Over Time {}'.format(name))
    plt.legend()
    plt.show()

Output (2 out of 5)

  1. Identify days with the largest price increases and decreases


sorted_data = stock_data.sort_values("daily_price_change",ascending=False)
largest_price_increase = sorted_data.head(1)
largest_price_increase
sorted_data = stock_data.sort_values("daily_price_change",ascending=False)
largest_price_decrease = sorted_data.tail(1)
largest_price_decrease
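
    The sort above finds the single largest rise and fall across the whole dataset. As a sketch of a related approach, idxmax() and idxmin() locate those rows directly, and combining idxmax() with groupby() gives the best day for each company. The toy DataFrame is made-up data, and per_company_max is an illustrative name.

```python
import pandas as pd

# Hypothetical data standing in for stock_data.
toy = pd.DataFrame({
    "Name": ["A", "A", "B", "B"],
    "Date": pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-03", "2022-01-04"]),
    "Open": [10.0, 12.0, 20.0, 22.0],
    "Close": [13.0, 11.0, 18.0, 27.0],
})
toy["daily_price_change"] = toy["Close"] - toy["Open"]

# Row with the largest increase and the largest decrease overall.
largest_increase = toy.loc[toy["daily_price_change"].idxmax()]
largest_decrease = toy.loc[toy["daily_price_change"].idxmin()]

# Best day for each company: one row per Name.
per_company_max = toy.loc[toy.groupby("Name")["daily_price_change"].idxmax()]

print(largest_increase)
print(largest_decrease)
print(per_company_max)
```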
  1. Identify stocks with unusually high trading volume on certain days


    This code iterates through the selected company names, filters the stock data for each company, and plots its trading volume over time. Days whose volume exceeds the company's 95th percentile are treated as unusually high and highlighted with red markers. The visualization makes it easy to compare trading volume patterns across companies, with a focus on days of exceptionally high volume.

for name in unique_names:
    company_data = stock_data[stock_data['Name'] == name]
    plt.plot(company_data['Date'],company_data['Volume'],label=name)
    threshold = company_data['Volume'].quantile(0.95)
    high_volume_data = company_data[company_data['Volume'] > threshold]
    plt.scatter(high_volume_data['Date'],high_volume_data['Volume'],color="red",marker='o',label="{} - High Volume Days".format(name))
    plt.title('Trading Volume Over Time with Emphasis on Unusually High Volume Days')
    plt.xlabel('Date')
    plt.ylabel('Trading Volume')
    plt.legend()
    plt.show()

Output (2 out of 5)
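
    To make the 95th-percentile threshold concrete, here is a small sketch on ten made-up volume figures. Only values strictly above the threshold are flagged.

```python
import pandas as pd

# Hypothetical daily volumes with one spike.
volume = pd.Series([100, 120, 110, 130, 900, 115, 125, 105, 95, 140])

# quantile() interpolates linearly between neighbouring values by default.
threshold = volume.quantile(0.95)
high_volume_days = volume[volume > threshold]

print(threshold)
print(high_volume_days.tolist())
```

    Because the threshold is a percentile, roughly 5% of days will always be flagged for every company, however calm its trading. A fixed multiple of the average volume would be one alternative if that is not the desired behaviour.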

  1. Explore the relationship between trading volume and volatility


    This code uses the daily percentage change in closing price, computed separately for each company, as a simple proxy for volatility. It then computes the correlation between volume and this measure for each company and draws one correlation heatmap per selected company to visualize the relationship between trading volume and price movement.

stock_data['Volatility'] = stock_data.groupby("Name")['Close'].pct_change()
correlation_matrix = stock_data.groupby("Name")[['Volume','Volatility']].corr()
for name in unique_names:
    plt.figure(figsize=(4,4))
    sns.heatmap(correlation_matrix.loc[name],annot=True)
    plt.title("Correlation Heatmap of {}".format(name))
    plt.show()

Output (2 out of 5)
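
    Each heatmap above is a 2 × 2 matrix, so the only informative number in it is the off-diagonal value. The sketch below pulls that number out per company on made-up data. It also uses the absolute return, on the assumption that the size of a move matters more than its direction here. That is a variation on the notebook's approach, not the original method.

```python
import pandas as pd

# Hypothetical data standing in for stock_data.
toy = pd.DataFrame({
    "Name": ["A"] * 5 + ["B"] * 5,
    "Close": [10.0, 11.0, 10.5, 12.0, 11.8, 50.0, 49.0, 52.0, 51.5, 55.0],
    "Volume": [100, 300, 150, 400, 120, 1000, 900, 1500, 800, 1700],
})

# Daily return per company; the first row of each group is NaN.
toy["Return"] = toy.groupby("Name")["Close"].pct_change()

# One correlation value per company between volume and the size of the move.
corr_by_name = {
    name: group["Volume"].corr(group["Return"].abs())
    for name, group in toy.groupby("Name")
}
print(corr_by_name)
```

    Series.corr() skips rows where either value is missing, so the NaN in each group's first row needs no special handling.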

  1. Correlation matrix between the 'Open', 'High', 'Low', and 'Close' prices


price_columns = ['Open', 'High', 'Low', 'Close']

# Group by 'Name' and calculate the correlation matrix for each group
correlation_matrices = stock_data.groupby('Name')[price_columns].corr()
for name in unique_names:
    print("Company Name : {}".format(name))
    print("-"*50)
    print(correlation_matrices.loc[name])
    print("-"*50)
  1. Heatmap to visualize the correlations using the seaborn package


for name in unique_names:
    plt.figure(figsize=(4,4))
    sns.heatmap(correlation_matrices.loc[name],annot=True)
    plt.title("Correlation Heatmap of {}".format(name))
    plt.show()

Output (2 out of 5)