{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Partitional Clustering (K-Means Clustering)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
 " <tr><th></th><th>NO</th><th>Unnamed: 0</th><th>Menu</th><th>subMenu</th><th>news_from</th><th>_press_title</th><th>_press_content</th></tr>\n", " </thead>\n", " <tbody>\n",
 " <tr><th>0</th><td>1</td><td>0</td><td>정치</td><td>청와대</td><td>파이낸셜뉴스</td><td>文대통령 \"'오월 정신'은 모두의 것...국가폭력 진상 밝혀내야\"</td><td>-5·18민주화운동?기념식?참석...취임?후?3번째-국가기념일?지정?후?첫?5·18...</td></tr>\n",
 " <tr><th>1</th><td>2</td><td>1</td><td>정치</td><td>국회/정당</td><td>뉴시스</td><td>통합당 \"5·18 발언 사과…희생 헛되지 않게 발벗고 나서야\"</td><td>\"짐작할 수 없는 슬픔 속에 사는 유가족에 위로\"\"해야 할 일 분명…주호영 광주 방...</td></tr>\n",
 " <tr><th>2</th><td>3</td><td>2</td><td>정치</td><td>북한</td><td>한겨레</td><td>보훈처, ‘6·25 참전’ 나바호족에 마스크 1만장 지원</td><td>6·25 전쟁에 참전했던 미국의 원주민 나바호족 용사들에게 마스크 1만장과 손소독제...</td></tr>\n",
 " <tr><th>3</th><td>4</td><td>3</td><td>정치</td><td>행정</td><td>연합뉴스</td><td>조길형 충주시장 \"수안보연수원 매입 절차 누락 내 책임\"</td><td>시의회 강도 높은 질책에 \"모든 조사 겸허히 받겠다\" 사과 (충주=연합뉴스) 박...</td></tr>\n",
 " <tr><th>4</th><td>5</td><td>4</td><td>정치</td><td>국방/외교</td><td>더팩트</td><td>북한 선전매체 \"5·18 대학살자들 청산해야\"</td><td>북한이 5·18 민주화운동 40주년인 18일을 맞아 철저한 진상규명과 책임자들에 대...</td></tr>\n",
 " </tbody>\n", "</table>
\n", "
" ], "text/plain": [ " NO Unnamed: 0 Menu subMenu news_from \\\n", "0 1 0 정치 청와대 파이낸셜뉴스 \n", "1 2 1 정치 국회/정당 뉴시스 \n", "2 3 2 정치 북한 한겨레 \n", "3 4 3 정치 행정 연합뉴스 \n", "4 5 4 정치 국방/외교 더팩트 \n", "\n", " _press_title \\\n", "0 文대통령 \"'오월 정신'은 모두의 것...국가폭력 진상 밝혀내야\" \n", "1 통합당 \"5·18 발언 사과…희생 헛되지 않게 발벗고 나서야\" \n", "2 보훈처, ‘6·25 참전’ 나바호족에 마스크 1만장 지원 \n", "3 조길형 충주시장 \"수안보연수원 매입 절차 누락 내 책임\" \n", "4 북한 선전매체 \"5·18 대학살자들 청산해야\" \n", "\n", " _press_content \n", "0 -5·18민주화운동?기념식?참석...취임?후?3번째-국가기념일?지정?후?첫?5·18... \n", "1 \"짐작할 수 없는 슬픔 속에 사는 유가족에 위로\"\"해야 할 일 분명…주호영 광주 방... \n", "2 6·25 전쟁에 참전했던 미국의 원주민 나바호족 용사들에게 마스크 1만장과 손소독제... \n", "3 시의회 강도 높은 질책에 \"모든 조사 겸허히 받겠다\" 사과 (충주=연합뉴스) 박... \n", "4 북한이 5·18 민주화운동 40주년인 18일을 맞아 철저한 진상규명과 책임자들에 대... " ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import scipy.cluster.hierarchy as shc\n", "from string import punctuation\n", "\n", "from konlpy.tag import Hannanum, Okt\n", "from sklearn.feature_extraction.text import CountVectorizer\n", "from sklearn.cluster import KMeans\n", "\n", "hannanum = Hannanum()\n", "okt = Okt()\n", "\n", "# Load the Korean stop-word dictionary, one entry per line.\n", "# The text file must be ANSI (CP949) or EUC-KR encoded; CP949 decodes both.\n", "with open(\"C:\\\\Users\\\\user\\\\Documents\\\\PythonTest\\\\Dic\\\\StopWordKorean.txt\", 'r', encoding=\"cp949\") as r_file:\n", "    kr_stop = r_file.read().splitlines()\n", "\n", "# punctuation is the list of symbols such as [, ], and ?.\n", "# Build a flat, de-duplicated list of stop words.\n", "stop_words = list(set(kr_stop + list(punctuation)))\n", "\n", "# Read the file to be clustered.\n", "Data = pd.read_csv('C:\\\\Users\\\\user\\\\Documents\\\\PythonTest\\\\Data\\\\(18-25)정치경제.csv', engine=\"python\")\n", "\n", "Data.head()\n" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "hannanum_docs= 84 okt_docs= 84\n" ] } ], "source": [ "hannanum_docs = []\n", "okt_docs = []\n", "\n", "# Extract only the nouns from each article body (Hannanum).\n", "for i in Data['_press_content']:\n", "    hannanum_docs.append(hannanum.nouns(i))\n", "\n", "# Join each noun list back into a single space-separated string.\n", "for i in range(len(hannanum_docs)):\n", "    hannanum_docs[i] = ' '.join(hannanum_docs[i])\n", "\n", "# Extract only the nouns from each article body (Okt).\n", "for i in Data['_press_content']:\n", "    okt_docs.append(okt.nouns(i))\n", "\n", "# Join each noun list back into a single space-separated string.\n", "for i in range(len(okt_docs)):\n", "    okt_docs[i] = ' '.join(okt_docs[i])\n", "\n", "print(\"hannanum_docs=\", len(hannanum_docs), \"okt_docs=\", len(okt_docs))\n" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "# Vectorize the Hannanum noun documents into a term-count matrix.\n", "vec_hannanum = CountVectorizer()\n", "X_hannanum = vec_hannanum.fit_transform(hannanum_docs)\n", "\n", "df_hannanum = pd.DataFrame(X_hannanum.toarray(), columns=vec_hannanum.get_feature_names())\n", "\n", "# Vectorize the Okt noun documents into a term-count matrix.\n", "vec_okt = CountVectorizer()\n", "X_okt = vec_okt.fit_transform(okt_docs)\n", "\n", "df_okt = pd.DataFrame(X_okt.toarray(), columns=vec_okt.get_feature_names())\n" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total_hannanum = 84 sum_i_1= 83 sum_i_0= 1\n", "total_okt = 84 sum_i_1= 83 sum_i_0= 1\n" ] } ], "source": [ "# K-means clustering (Hannanum nouns)\n", "kmeans_hannanum = KMeans(n_clusters=2).fit(df_hannanum)\n", "sum_i_1=0\n", "sum_i_0=0\n", "\n", "for i 
in range(len(kmeans_hannanum.labels_)):\n", "    if kmeans_hannanum.labels_[i] == 1:\n", "        sum_i_1 = sum_i_1 + 1\n", "    else:\n", "        sum_i_0 = sum_i_0 + 1\n", "\n", "# Print the Hannanum counts here, before the counters are reset for Okt.\n", "print(\"total_hannanum = \", len(kmeans_hannanum.labels_), \" sum_i_1=\", sum_i_1, \" sum_i_0=\", sum_i_0)\n", "\n", "# K-means clustering (Okt nouns)\n", "kmeans_okt = KMeans(n_clusters=2).fit(df_okt)\n", "sum_i_1 = 0\n", "sum_i_0 = 0\n", "\n", "for i in range(len(kmeans_okt.labels_)):\n", "    if kmeans_okt.labels_[i] == 1:\n", "        sum_i_1 = sum_i_1 + 1\n", "    else:\n", "        sum_i_0 = sum_i_0 + 1\n", "\n", "print(\"total_okt = \", len(kmeans_okt.labels_), \" sum_i_1=\", sum_i_1, \" sum_i_0=\", sum_i_0)\n", "\n", "# Assumption: the cells below refer to `kmeans` and `df`, which are never defined in this notebook; alias them to the Okt results.\n", "kmeans, df = kmeans_okt, df_okt" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "[0 1]\n", "1\n" ] } ], "source": [ "# Check the number of clusters.\n", "print(kmeans.n_clusters)\n", "\n", "# Check the array of cluster labels, one per document.\n", "print(kmeans.labels_)\n", "\n", "# Number of iterations run.\n", "print(kmeans.n_iter_)\n", "\n", "# print(cluster.children_)\n" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
 " <tr><th></th><th>text</th><th>cluster</th></tr>\n", " </thead>\n", " <tbody>\n",
 " <tr><th>0</th><td>-5·18민주화운동?기념식?참석...취임?후?3번째-국가기념일?지정?후?첫?5·18...</td><td>0</td></tr>\n",
 " <tr><th>1</th><td>지난 15일 한화손해보험 임직원들이 여의도사옥에서 대회의실에서 홀몸 어르신을 위한 ...</td><td>1</td></tr>\n",
 " </tbody>\n", "</table>
\n", "
" ], "text/plain": [ " text cluster\n", "0 -5·18민주화운동?기념식?참석...취임?후?3번째-국가기념일?지정?후?첫?5·18... 0\n", "1 지난 15일 한화손해보험 임직원들이 여의도사옥에서 대회의실에서 홀몸 어르신을 위한 ... 1" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Collect each article body and its cluster label into a DataFrame.\n", "resl = pd.DataFrame({'text': Data['_press_content'], 'cluster': kmeans.labels_})\n", "\n", "resl\n" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
 " <tr><th></th><th>text</th><th>cluster</th></tr>\n", " </thead>\n", " <tbody>\n",
 " <tr><th>1</th><td>지난 15일 한화손해보험 임직원들이 여의도사옥에서 대회의실에서 홀몸 어르신을 위한 ...</td><td>1</td></tr>\n",
 " <tr><th>0</th><td>-5·18민주화운동?기념식?참석...취임?후?3번째-국가기념일?지정?후?첫?5·18...</td><td>0</td></tr>\n",
 " </tbody>\n", "</table>
\n", "
" ], "text/plain": [ " text cluster\n", "1 지난 15일 한화손해보험 임직원들이 여의도사옥에서 대회의실에서 홀몸 어르신을 위한 ... 1\n", "0 -5·18민주화운동?기념식?참석...취임?후?3번째-국가기념일?지정?후?첫?5·18... 0" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "#클러스터 순으로 소팅한다.\n", "resl2 = resl.sort_values(by=['cluster'],ascending=False)\n", "resl2\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAD6CAYAAABEUDf/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAVoklEQVR4nO3df4xd5X3n8fe3Y4yZACVgh7AMjk3W3fAjxsCEGlgpJCSDF8iQTYmEYROkRbLagABRIAZHCkWdiDZRykZxflgESqTI4E1hffNrOwQS0aIW1y4OwRjWhqZlAop/UBPQFIjt7/5xrvHYvmN7fM+dO/fM+yVZ9zzPOfc8zz2WP/f4Oec8NzITSVI1/V67OyBJah1DXpIqzJCXpAoz5CWpwgx5SaowQ16SKqy0kI+Iroh4KiJ+WC/PjognI2JDRDwYEVPLakuSdHCirPvkI+ImoBc4OjMvjYgVwEOZ+UBEfAv4RWZ+c3/7mD59es6aNauU/kjSZLFmzZotmTmj0bopZTQQET3AJcAAcFNEBPBR4Mr6JvcDdwD7DflZs2axevXqMrokSZNGRPzraOvKGq65G7gV2FkvHwdsy8zt9fIQcGJJbUmSDlLTIR8RlwKbMnPNyOoGmzYcF4qIRRGxOiJWb968udnuSJJGKONM/nygPyJ+BTxAMUxzN3BMROwaDuoBXm705sxclpm9mdk7Y0bDISVJ0iFqekw+M28DbgOIiAuAmzPzqoj438DlFMF/NbCy2bYkVc/vfvc7hoaGePPNN9vdlQlv2rRp9PT0cNhhhx30e0q58DqKzwMPRMSfA08B32lhW5I61NDQEEcddRSzZs2iuGdDjWQmW7duZWhoiNmzZx/0+0oN+cz8OfDz+vKLwDll7l9S9bz55psG/EGICI477jjGeu3SJ16lMtRqcN11xavGzIA/OIdynAx5qVm1GixcCEuXFq8GvSYQQ15q1uAgDA8Xy8PDRVkd7Y477uArX/nKmN+3bds2vvGNbxxSm0uWLOGkk07iyCOPPKT3j8aQl5rV1wfd3cVyd3dR1qR0KCGfmezcuZNPfOITrFq1qvQ+GfJSs/r7YflyuPba4rW/v9090hh997vfZe7cuZxxxhl85jOf2WPdBRdc8M50K1u2bGHX/Frr1q3jnHPOYd68ecydO5cNGzawePFiXnjhBebNm8ctt9wCwJe//GU+9KEPMXfuXL74xS8C8Ktf/YpTTjmFz33uc5x11lm89NJLzJ8/nxNOOKH0z9bKWyilyaO/33AfT7VaMSzW19f0cV+3bh0DAwM88cQTTJ8+nVdffZWvfe1rB3zft771LW644Qauuuoq3n77bXbs2MFdd93FM888w9q1awEYHBxkw4YNrFq1isykv7+fxx9/nJkzZ/L8889z3333HfLwzsEy5CV1l
l0XuoeH4b77mv7f02OPPcbll1/O9OnTATj22GMP6n3nnnsuAwMDDA0N8alPfYo5c+bss83g4CCDg4OceeaZALzxxhts2LCBmTNn8r73vY/58+cfcr8PlsM1kjpLyRe6M3O/tyZOmTKFnTuLuRdHPpV75ZVXUqvVOOKII7jooot47LHHGu77tttuY+3ataxdu5aNGzdyzTXXAPCud72rqX4fLENeUmcp+UL3hRdeyIoVK9i6dSsAr7766h7rZ82axZo1xfyL3//+99+pf/HFFzn55JO5/vrr6e/v5+mnn+aoo47i9ddff2ebiy66iHvvvZc33ngDgF//+tds2rSpqf6OlSEvqbOUfKH7tNNOY8mSJXz4wx/mjDPO4Kabbtpj/c0338w3v/lNzjvvPLZs2fJO/YMPPsjpp5/OvHnzeO655/jsZz/Lcccdx/nnn8/pp5/OLbfcQl9fH1deeSXnnnsuH/zgB7n88sv3+BIY6dZbb6Wnp4fh4WF6enq44447mvpcu5T2y1Bl6O3tTX80RJpc1q9fzymnnNLubnSMRscrItZkZm+j7T2Tl6QKM+QlqcIMeUmqMENekirMkJekCjPkJanCDHlJ2st4TzU8PDzMJZdcwgc+8AFOO+00Fi9ePOZ9jMaQl6SSHOpUw1A8dPXcc8/x1FNP8cQTT/CTn/yklD4Z8pImvXZPNbx582Y+8pGPADB16lTOOusshoaGSvlszkIpqeOUONPwhJtqeNu2bfzgBz/ghhtuaO6D1RnykjpKyTMNT6iphrdv387ChQu5/vrrOfnkkw/9Q43gcI2kjlL2T+pOpKmGFy1axJw5c7jxxhub+1AjNB3yETEtIlZFxC8iYl1E/Fm9fnZEPBkRGyLiwYiY2nx3JU12Zf+k7kSZavgLX/gCr732GnfffXdzH2gvZZzJvwV8NDPPAOYBCyJiPvAXwF9l5hzg34FrSmhL0iRX9k/qToSphoeGhhgYGODZZ5/lrLPOYt68edxzzz3NfbC6Uqcajohu4O+BPwF+BLw3M7dHxLnAHZl50f7e71TD0uTjVMNj05aphiOiKyLWApuAR4AXgG2Zub2+yRBwYhltSZIOXikhn5k7MnMe0AOcAzT6Wm74X4aIWBQRqyNi9ebNm8vojiSprtS7azJzG/BzYD5wTETsukWzB3h5lPcsy8zezOydMWNGmd2R1CEm0i/UTWSHcpzKuLtmRkQcU18+AvgYsB74GXB5fbOrgZXNtiWpeqZNm8bWrVsN+gPITLZu3cq0adPG9L4yHoY6Abg/IroovjRWZOYPI+JZ4IGI+HPgKeA7JbQlqWJ6enoYGhrC4doDmzZtGj09PWN6T9Mhn5lPA2c2qH+RYnxekkZ12GGHMXv27HZ3o7J84lWSKsyQl6QKM+QlqcIMeUmqMENekirMkJekCjPkJanCDHlJqjBDXpIqzJCXpAoz5CWpwgx5SaowQ16SKsyQl6QKM+QlqcIMeUmqMENekirMkJekCjPkJanCDHlJqjBDXpIqzJCXpAoz5NX5ajW47rriVdIemg75iDgpIn4WEesjYl1E3FCvPzYiHomIDfXXdzffXWkvtRosXAhLlxavBr20hzLO5LcDf5qZpwDzgWsj4lRgMfBoZs4BHq2XpXINDsLwcLE8PFyUJb2j6ZDPzFcy85/ry68D64ETgcuA++ub3Q98stm2pH309UF3d7Hc3V2UJb1jSpk7i4hZwJnAk8DxmfkKFF8EEfGeMtuSAOjvh+XLizP4vr6iLOkdpYV8RBwJ/A1wY2b+NiIO9n2LgEUAM2fOLKs7mkz6+w13aRSl3F0TEYdRBPz3MvOhevVvIuKE+voTgE2N3puZyzKzNzN7Z8yYUUZ3JEl1ZdxdE8B3gPWZ+dURq2rA1fXlq4GVzbYlSRqbMoZrzgc+A/wyItbW624H7gJWRMQ1wL8Bny6hLUnSGDQd8pn598BoA/AXNrt/SdKh84lXSaowQ16SKsyQl6QKM+QlqcIMe
UmqMENekirMkJekCjPkJanCDHlJqjBDXpIqzJCXpAoz5CWpwgx5SaowQ16SKsyQl6QKM+QlqcIMeUmqMENekirMkJekCjPkJanCDHlJqjBDXpIqzJCXpAorJeQj4t6I2BQRz4yoOzYiHomIDfXXd5fRliTp4JV1Jv/XwIK96hYDj2bmHODRelmSNI5KCfnMfBx4da/qy4D768v3A58soy1J0sFr5Zj88Zn5CkD99T0tbEuS1EDbL7xGxKKIWB0Rqzdv3tzu7khSpbQy5H8TEScA1F83NdooM5dlZm9m9s6YMaOF3ZGkyaeVIV8Drq4vXw2sbGFbkqQGyrqFcjnwD8B/iYihiLgGuAv4eERsAD5eL0uSxtGUMnaSmQtHWXVhGfuvtFoNBgehrw/6+9vdG0kV0/YLr5NarQYLF8LSpcVrrdbuHkmqGEO+nQYHYXi4WB4eLsqSVCJDvp36+qC7u1ju7i7KklSiUsbkdYj6+2H5csfkJbWMId9u/f2Gu6SWcbhGkirMkJekCjPkJanCDPmJqFaD667zvnlJTTPkJxofkJJUIkN+ovEBKUklMuQnGh+QklQi75OfaHxASlKJDPmJyAekJJXE4RpJqjBDXpIqzJCXpAoz5CWpwgx5SaowQ36icmoDSSUw5CcipzaQVJLq3ydfq8Gdd8KWLXDVVfCHf1g8aPT7vw+vvbb7idKJ9PBRo6kNJkK/JHWc6oR8rbZvUNdq8Ed/BNu3F+UvfQmmTNldBrjnHsiEt9+G++4rnjZtd6D29RV9GR52agNJTWl5yEfEAuB/AV3APZl5V+mN7BreGB7eHdRPPglf/eqegQ77lt96a/fyRDlrdmoDSSVpachHRBewFPg4MAT8U0TUMvPZUhv69rf3HN644gr4j/8YrVPFmXsjE+ms2akNJJWg1RdezwE2ZuaLmfk28ABwWakt1Grw6KN71o0W8FAE/O81+NgRcOONBqukSml1yJ8IvDSiPFSvK8/g4J5DLgdj58596zKLC7GSVCGtDvloULfHWElELIqI1RGxevPmzWNvoazhla6uiTNUI0klaXXIDwEnjSj3AC+P3CAzl2Vmb2b2zpgxY+wt9PfD0Uc31cl6R4qLtbvs72Gk0db5AJOkiSYzW/aH4sLui8BsYCrwC+C00bY/++yz85DcfntmEdPN/enqKvZ18cWZU6cWdd3dmStX7m5r5cqibu91o9VLUosBq3OUXG3pmXxmbgeuA/4WWA+syMx1pTc0MAC33w6nnw7nnVec2Te6uHogO3bAX/4l/PjHxX3zsO/vrI72G6z+NqukCajl0xpk5o8z8w8y8/2ZOdCyhgYG4Je/hCeeKC6gPvzw7t9K7eoqwr+ra//76Ora9z76vW+r7OuDqVOL5alTd6/zt1klTUDVnbtm5Jn1jh1w5pnw0ENw8cXw/vfvue3RR8PZZ8PnP787qA8/vNi20ROwEXu+wu4HmK69dmI8NStJVDnkG51Z9/fDj34ECxbsue1vfwtr1hTLu4J6xYpi273DeuQtm2+9teewTH8/fP3rBrykCaO6Ib+/M+uRXwAj1WoHDmqHZSR1kOpMUNbIaFMD7PoCuPPO3Wfwu+oPZp/OKyOpQ0SONo9LG/T29ubq1avHt9ElS3afwQ+07rqwJLVKRKzJzN6G6yZ9yEtSh9tfyFd3TF7l8UleqWMZ8mWocgj6U4RSRzPkm1X1EPRJXqmjGfLNqnoIesuo1NEM+WZVPQR9klfqaN5dU4ZGPyIuSeNkf3fXVPthqPHSzO+xdsIXRCf0UVJDDte0UydctO2EPkoalSHfTp1w0bYT+ihpVIZ8O3XCRdtO6KOkUTkm3y675sy5+GI4/viJO97thGxSRzPk22HJEvjSl4rlZ54pfrpwIodnMxeWJbWVwzXtsPfFSy9mSmoRQ74d9j4r9ixZUos4XNMOu+atdx57SS3mE6+S1OGcT16SJqmmQj4iPh0R6yJiZ0T07rXutojYGBHPR8RFzXVTk
nQomh2Tfwb4FPDtkZURcSpwBXAa8J+An0bEH2TmjibbkySNQVNn8pm5PjOfb7DqMuCBzHwrM/8F2Aic00xbkqSxa9WY/InASyPKQ/U6SdI4OuBwTUT8FHhvg1VLMnPlaG9rUNfwNp6IWAQsApg5c+aBuiNJGoMDhnxmfuwQ9jsEnDSi3AO8PMr+lwHLoLiF8hDakiSNolXDNTXgiog4PCJmA3OAVS1qS5I0imZvofzvETEEnAv8KCL+FiAz1wErgGeB/wtc6501kjT+mrqFMjMfBh4eZd0A4PP6ktRGPvEqSRVmyEtShRnyklRhhrwkVZghL0kVZshLUoUZ8pJUYYa8JFWYIS9JFWbIS1KFGfKSVGGGvCRVmCEvSRVmyEtShRnyklRhhrwkVZghL0kVZshLUoUZ8pJUYYa8JFWYIS9JFWbIS1KFGfKSVGFNhXxEfDkinouIpyPi4Yg4ZsS62yJiY0Q8HxEXNd9VSdJYNXsm/whwembOBf4fcBtARJwKXAGcBiwAvhERXU22JUkao6ZCPjMHM3N7vfiPQE99+TLggcx8KzP/BdgInNNMW5KksStzTP5/Aj+pL58IvDRi3VC9TpI0jqYcaIOI+Cnw3garlmTmyvo2S4DtwPd2va3B9jnK/hcBiwBmzpx5EF2WJB2sA4Z8Zn5sf+sj4mrgUuDCzNwV5EPASSM26wFeHmX/y4BlAL29vQ2/CCRJh6bZu2sWAJ8H+jNzeMSqGnBFRBweEbOBOcCqZtqSJI3dAc/kD+DrwOHAIxEB8I+Z+ceZuS4iVgDPUgzjXJuZO5psS5I0Rk2FfGb+5/2sGwAGmtm/JKk5PvEqSRVmyEtShRnyklRhhrwkVZghL0kVZshLUhvVanDddcVrKxjyktQmtRosXAhLlxavrQh6Q16S2mRwEIbrcwUMDxflshnyktQmfX3Q3V0sd3cX5bI1O62BJOkQ9ffD8uXFGXxfX1EumyEvSW3U39+acN/F4RpJqjBDXpIqzJCXpAoz5CWpwgx5SaowQ16SKsyQl6QKi8xsdx/eERGbgX9tdz/qpgNb2t2JNvMYeAzAYwAT/xi8LzNnNFoxoUJ+IomI1ZnZ2+5+tJPHwGMAHgPo7GPgcI0kVZghL0kVZsiPblm7OzABeAw8BuAxgA4+Bo7JS1KFeSYvSRVmyDcQEQsi4vmI2BgRi9vdn/EQEfdGxKaIeGZE3bER8UhEbKi/vrudfWy1iDgpIn4WEesjYl1E3FCvnxTHISKmRcSqiPhF/fP/Wb1+dkQ8Wf/8D0bE1Hb3tdUioisinoqIH9bLHXsMDPm9REQXsBT4b8CpwMKIOLW9vRoXfw0s2KtuMfBoZs4BHq2Xq2w78KeZeQowH7i2/nc/WY7DW8BHM/MMYB6wICLmA38B/FX98/87cE0b+zhebgDWjyh37DEw5Pd1DrAxM1/MzLeBB4DL2tynlsvMx4FX96q+DLi/vnw/8Mlx7dQ4y8xXMvOf68uvU/wjP5FJchyy8Ea9eFj9TwIfBb5fr6/s598lInqAS4B76uWgg4+BIb+vE4GXRpSH6nWT0fGZ+QoUAQi8p839GTcRMQs4E3iSSXQc6sMUa4FNwCPAC8C2zNxe32Qy/Hu4G7gV2FkvH0cHHwNDfl/RoM5bkCaRiDgS+Bvgxsz8bbv7M54yc0dmzgN6KP5Xe0qjzca3V+MnIi4FNmXmmpHVDTbtmGPgb7zuawg4aUS5B3i5TX1pt99ExAmZ+UpEnEBxdldpEXEYRcB/LzMfqldPuuOQmdsi4ucU1yaOiYgp9TPZqv97OB/oj4iLgWnA0RRn9h17DDyT39c/AXPqV9OnAlcAtTb3qV1qwNX15auBlW3sS8vVx16/A6zPzK+OWDUpjkNEzIiIY+rLRwAfo7gu8TPg8vpmlf38AJl5W2b2ZOYsin/7j2XmVXTwMfBhqAbq3+J3A13AvZk50OYutVxELAcuoJht7zfAF4H/A6wAZgL/Bnw6M/e+OFsZEfFfg
b8Dfsnu8djbKcblK38cImIuxUXFLooTwBWZeWdEnExxA8KxwFPA/8jMt9rX0/ERERcAN2fmpZ18DAx5Saowh2skqcIMeUmqMENekirMkJekCjPkJanCDHlJqjBDXpIqzJCXpAr7/ybhF0po1rO7AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "from sklearn.decomposition import PCA\n", "import matplotlib.pyplot as plt\n", "\n", "\n", "\n", "\n", "\n", "pca = PCA(n_components=2)\n", "principalComponents = pca.fit_transform(df)\n", "principalDf = pd.DataFrame(data = principalComponents\n", " , columns = ['principal component 1', 'principal component 2'])\n", "\n", "\n", "\n", "\n", "\n", "principalDf.index=Data['_press_content']\n", "\n", "\n", "\n", "\n", "\n", "kmeans.labels_ == 0\n", "\n", "\n", "\n", "\n", "\n", "# x축 : first y출 : second 번호로 나타낸후 plot으로 시각화\n", "plt.scatter(principalDf.iloc[kmeans.labels_ == 0, 0], principalDf.iloc[kmeans.labels_ == 0, 1], s = 10, c = 'red', label = 'cluster1')\n", "plt.scatter(principalDf.iloc[kmeans.labels_ == 1, 0], principalDf.iloc[kmeans.labels_ == 1, 1], s = 10, c = 'blue', label = 'cluster2')\n", "#plt.scatter(principalDf.iloc[kmeans.labels_ == 2, 0], principalDf.iloc[kmeans.labels_ == 2, 1], s = 10, c = 'green', label = 'cluster3')\n", "plt.legend()\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }