数据分析项目-房屋价格预测.zip
Resource Overview
The archive contains the author's own source code, the downloaded references, the complete paper, and the data before and after processing: 1. Source code: the complete code for the data analysis workflow, so the results can be reproduced. 2. References: all the literature consulted, for further reading on the theory and background. 3. Complete paper: a detailed paper of more than ten thousand Chinese characters covering every aspect of the analysis. 4. Data before and after processing: the raw data and the processed data, for comparison and further analysis. <link href="/image.php?url=https://csdnimg.cn/release/download_crawler_static/css/base.min.css" rel="stylesheet"/><link href="/image.php?url=https://csdnimg.cn/release/download_crawler_static/css/fancy.min.css" rel="stylesheet"/><link href="/image.php?url=https://csdnimg.cn/release/download_crawler_static/89797149/raw.css" rel="stylesheet"/><div id="sidebar" style="display: none"><div id="outline"></div></div><div class="pf w0 h0" data-page-no="1" id="pf1"><div class="pc pc1 w0 h0"><img alt="" class="bi x0 y0 w1 h1" src="/image.php?url=https://csdnimg.cn/release/download_crawler_static/89797149/bg1.jpg"/><div class="c x0 y1 w2 h0"><div class="t m0 x1 h2 y2 ff1 fs0 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x2 h3 y3 ff2 fs1 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x3 h4 y4 ff3 fs2 fc0 sc1 ls0 ws0">House Price Prediction<span class="ff4"> </span></div>
<div class="t m0 x4 h5 y5 ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x1 h6 y6 ff5 fs4 fc0 sc0 ls0 ws0">Abstract<span class="ff2"> </span></div>
<div class="t m0 x5 h5 y7 ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x5 h7 y8 ff5 fs3 fc0 sc0 ls0 ws0">Buying a home has become a hot topic in today's society. To buy the home one wants at a more affordable</div>
<div class="t m0 x6 h7 y9 ff5 fs3 fc0 sc0 ls0 ws0">price, it is essential to understand how the real-estate market changes. This paper therefore studies the data</div>
<div class="t m0 x6 h5 ya ff5 fs3 fc0 sc0 ls0 ws0">analysis and prediction of the indicators that influence house prices.<span class="ff2"> </span></div>
<div class="t m0 x5 h5 yb ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x5 h7 yc ff5 fs3 fc0 sc1 ls0 ws0">Problem 1: <span class="sc0">Drawing on current events and a review of the literature, we identify the feature indicators that affect</span></div>
<div class="t m0 x6 h7 yd ff5 fs3 fc0 sc0 ls0 ws0">house sales and prices, interpret and discuss the meaning of each indicator, and use basic theory and common sense to judge which indicators affect the total house price.</div>
<div class="t m0 x6 h5 ye ff5 fs3 fc0 sc0 ls0 ws0">We also find that the indicators affecting price differ slightly from one housing dataset to another, so they must be determined by analyzing the data.<span class="ff2"> </span></div>
<div class="t m0 x5 h5 yf ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x5 h7 y10 ff5 fs3 fc0 sc1 ls0 ws0">Problem 2: <span class="sc0">Building on the understanding of the indicators from Problem 1, we analyze the data. The data contain</span></div>
<div class="t m0 x6 h7 y11 ff5 fs3 fc0 sc0 ls0 ws0">misaligned values, missing values, and outliers, which are handled by shifting, filling, or deleting as appropriate;</div>
<div class="t m0 x6 h7 y12 ff5 fs3 fc0 sc0 ls0 ws0">indicators with no analytical value are also removed. Each remaining indicator is examined with several visualizations and feature-processing steps,</div>
<div class="t m0 x6 h7 y13 ff5 fs3 fc0 sc0 ls0 ws0">and its value distribution and its relationship to the total house price are used to judge whether it affects the total price.</div>
<div class="t m0 x6 h7 y14 ff5 fs3 fc0 sc0 ls0 ws0">Influential indicators with categorical values are dummy-encoded. <span class="sc1">Conclusion</span>: <span class="ff6">13</span> feature indicators affect house prices,</div>
<div class="t m0 x6 h5 y15 ff5 fs3 fc0 sc0 ls0 ws0">and <span class="ff6">10</span> indicators are useless or have no effect.<span class="ff2"> </span></div>
<div class="t m0 x5 h5 y16 ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x7 h8 y17 ff5 fs3 fc0 sc1 ls0 ws0">Problem 3: <span class="sc0">The <span class="ff6">13</span> feature indicators obtained in Problem 2 are denoted <span class="ff7">𝑥<sub>1</sub>~𝑥<sub>14</sub></span>, and the total</span></div>
<div class="t m0 x6 h8 y19 ff5 fs3 fc0 sc0 ls0 ws0">house price is <span class="ff7">𝑦</span>. We build a <span class="ff6">GSRF</span> regression model: Grid-Search Random Forest (<span class="ff6">GSRF</span>) regression</div>
<div class="t m0 x6 h7 y1a ff6 fs3 fc0 sc0 ls0 ws0">is an improved random forest algorithm. For the three parameters “max_depth”, “min_samples_split”,</div>
<div class="t m0 x6 h7 y1b ff5 fs3 fc0 sc0 ls2 ws0">and “<span class="ff6 ls0">n_estimators</span>”, the best combination found is <span class="ff6 ls0">11</span>, <span class="ff6 ls0">20</span>, and <span class="ff6 ls0">82</span>, respectively, and the best model parameters</div>
<div class="t m0 x6 h7 y1c ff5 fs3 fc0 sc0 ls0 ws0">score <span class="ff6">0.775</span>. <span class="sc1">Conclusion</span>: on the test set, the coefficient of determination is <span class="ff6">0.782</span>, the <span class="ff6">MAE</span> is <span class="ff6">83.509</span>, and the <span class="ff6">MSE</span></div>
<div class="t m0 x6 h5 y1d ff5 fs3 fc0 sc0 ls0 ws0">is <span class="ff6">138.676</span>, indicating good regression performance; finally, the regression equation is obtained.<span class="ff2"> </span></div>
<div class="t m0 x5 h5 y1e ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x5 h5 y1f ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x5 h5 y20 ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x6 h5 y21 ff5 fs3 fc0 sc1 ls0 ws0">Keywords<span class="sc0">: house prices, data analysis, feature processing, <span class="ff6">GSRF</span> model<span class="ff2"> </span></span></div>
<div class="t m0 x5 h5 y22 ff2 fs3 fc0 sc0 
ls0 ws0"> </div>
<div class="t m0 x5 h5 y23 ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x6 h5 y24 ff2 fs3 fc0 sc0 ls0 ws0"> <span class="_ _10"> </span> </div></div></div><div class="pi" data-data='{"ctm":[1.611792,0.000000,0.000000,1.611792,0.000000,0.000000]}'></div></div><div id="pf2" class="pf w0 h0" data-page-no="2"><div class="pc pc2 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="/image.php?url=https://csdnimg.cn/release/download_crawler_static/89797149/bg2.jpg"><div class="c x0 y1 w2 h0"><div class="t m0 xc h3 y25 ff2 fs1 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 xc h3 y3 ff2 fs1 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 xd h4 y26 ff3 fs2 fc1 sc0 ls0 ws0">Contents<span class="ff4"> </span></div>
<div class="t m0 x6 h5 y27 ff6 fs3 fc0 sc0 ls0 ws0">1 <span class="ff8">Problem Restatement<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y28 ff6 fs0 fc0 sc0 ls0 ws0">1.1 <span class="ff8">Background<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y29 ff6 fs0 fc0 sc0 ls0 ws0">1.2 <span class="ff8">Problem Statement<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y2a ff6 fs3 fc0 sc0 ls0 ws0">2 <span class="ff8">Problem Analysis<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y2b ff6 fs0 fc0 sc0 ls0 ws0">2.1 <span class="ff8">Analysis of Problem 1<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y2c ff6 fs0 fc0 sc0 ls0 ws0">2.2 <span class="ff8">Analysis of Problem 2<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y2d ff6 fs0 fc0 sc0 ls0 ws0">2.3 <span class="ff8">Analysis of Problem 3<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y2e ff6 fs3 fc0 sc0 ls0 ws0">3 <span class="ff8">Assumptions<span class="ff2"> ........................................ <span class="ff6">1<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y2f ff6 fs3 fc0 sc0 ls0 ws0">4 <span class="ff8">Problem 1: Interpretation and Discussion of Indicators<span class="ff2"> ........................................ <span class="ff6">2<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y30 ff6 fs3 fc0 sc0 ls0 ws0">5 <span class="ff8">Problem 2: Data Analysis and Processing<span class="ff2"> ........................................ <span class="ff6">3<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y31 ff6 fs0 fc0 sc0 ls0 ws0">5.1 <span class="ff8">Data Preprocessing<span class="ff2"> ........................................ <span class="ff6">3<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y32 ff6 fs0 fc0 sc0 ls0 ws0">5.2 <span class="ff8">District<span class="ff2"> ........................................ <span class="ff6">5<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y33 ff6 fs0 fc0 sc0 ls0 ws0">5.3 <span class="ff8">Floor Area<span class="ff2"> ........................................ <span class="ff6">6<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y34 ff6 fs0 fc0 sc0 ls0 ws0">5.4 <span class="ff8">Orientation<span class="ff2"> ........................................ <span class="ff6">7<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y35 ff2 fs0 fc0 sc0 ls0 ws0">5.5<span class="ff6"> <span class="ff8">Floor Level and Number of Floors<span class="ff2"> ........................................ <span class="ff6">8<span class="ff9 fs6"> </span></span></span></span></span></div>
<div class="t m0 xe ha y36 ff6 fs0 fc0 sc0 ls0 ws0">5.6 <span class="ff8">Elevator-to-Household Ratio and Property Ownership<span class="ff2"> ........................................ <span class="ff6">9<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y37 ff6 fs0 fc0 sc0 ls0 ws0">5.7 <span class="ff8">Layout<span class="ff2"> ........................................ <span class="ff6">10<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y38 ff6 fs0 fc0 sc0 ls0 ws0">5.8 <span class="ff8">Building Type, Structure, and Decoration<span class="ff2"> ........................................ <span class="ff6">11<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y39 ff6 fs0 fc0 sc0 ls0 ws0">5.9 <span class="ff8">Usage and Holding Period<span class="ff2"> ........................................ <span class="ff6">12<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y3a ff6 fs0 fc0 sc0 ls0 ws0">5.10 <span class="ff8">Transaction Ownership and Mortgage Information<span class="ff2"> ........................................ <span class="ff6">12<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y3b ff6 fs0 fc0 sc0 ls0 ws0">5.11 <span class="ff8">Summary<span class="ff2"> ........................................ <span class="ff6">13<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y3c ff6 fs3 fc0 sc0 ls0 ws0">6 <span class="ff8">Problem 3: Model Building and Solution<span class="ff2"> ........................................ <span class="ff6">14<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y3d ff6 fs0 fc0 sc0 ls0 ws0">6.1 <span class="ff8">Improved Random Forest (GSRF)<span class="ff2"> ........................................ <span class="ff6">14<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y3e ff6 fs0 fc0 sc0 ls0 ws0">6.2 <span class="ff8">Building the GSRF Model<span class="ff2"> ........................................ <span class="ff6">14<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 xe ha y3f ff6 fs0 fc0 sc0 ls0 ws0">6.3 <span class="ff8">Evaluation of the GSRF Model<span class="ff2"> ........................................ <span class="ff6">17<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y40 ff6 fs3 fc0 sc0 ls0 ws0">7 <span class="ff8">Strengths and Weaknesses of the Model<span class="ff2"> ........................................ <span class="ff6">20<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y41 ff6 fs3 fc0 sc0 ls0 ws0">8 <span class="ff8">References<span class="ff2"> ........................................ <span class="ff6">20<span class="ff9 fs6"> </span></span></span></span></div>
<div class="t m0 x6 h5 y42 ff8 fs3 fc0 sc0 ls0 ws0">Appendix<span class="ff2"> ........................................ <span class="ff6">20<span class="ff9 fs7"> </span></span></span></div></div>
<a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a>
<a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a>
<a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a>
<a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a><a class="l"><div class="d m1"></div></a></div><div class="pi" data-data='{"ctm":[1.611792,0.000000,0.000000,1.611792,0.000000,0.000000]}'></div></div><div id="pf3" class="pf w0 h0" data-page-no="3"><div class="pc pc3 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="/image.php?url=https://csdnimg.cn/release/download_crawler_static/89797149/bg3.jpg"><div class="c x0 y1 w2 h0"><div class="t m0 xc h3 y25 ff2 fs1 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 xf h3 y3 ff2 fs1 fc0 sc0 
ls0 ws0">1 </div>
<div class="t m0 x10 hb y43 ffa fs2 fc0 sc0 ls0 ws0">1<span class="ffb"> <span class="ffc sc1 ls5">Problem Restatement</span></span> </div>
<div class="t m0 x6 hc y44 ffa fs4 fc0 sc0 ls0 ws0">1.1<span class="ffb"> <span class="ffc sc1">Background</span></span> </div>
<div class="t m0 x5 h7 y45 ff5 fs3 fc0 sc0 ls0 ws0">Buying a home has become a hot topic in today's society. Many people regard a home as a necessity, and buying one is among the goals</div>
<div class="t m0 x6 h7 y46 ff5 fs3 fc0 sc0 ls0 ws0">they work toward. For most people, however, owning a home of their own is not easy.</div>
<div class="t m0 x6 h7 y47 ff5 fs3 fc0 sc0 ls0 ws0">To buy the home one wants at a more affordable price, it is essential to understand how the real-estate market changes.</div>
<div class="t m0 x6 h7 y48 ff5 fs3 fc0 sc0 ls0 ws0">Many factors influence house prices. If prices can be predicted, buyers gain more reference information and can</div>
<div class="t m0 x6 h7 y49 ff5 fs3 fc0 sc0 ls0 ws0">find homes that offer better value for money, which has considerable practical value. This paper therefore studies</div>
<div class="t m0 x6 h5 y4a ff5 fs3 fc0 sc0 ls0 ws0">house price prediction.<span class="ff2"> </span></div>
<div class="t m0 x6 hc y4b ffa fs4 fc0 sc0 ls0 ws0">1.2<span class="ffb"> <span class="ffc sc1">Problem Statement</span></span> </div>
<div class="t m0 x7 h5 y4c ff5 fs3 fc0 sc0 ls0 ws0">From the background and conditions given in the problem description, we need to solve the following problems:<span class="ff2"> </span></div>
<div class="t m0 x7 hd y4d ffd fs3 fc0 sc0 ls0 ws0">➢<span class="ffe"> <span class="ff5">Problem 1: Considering current events, the economy, and similar factors, discuss the indicators that influence house sales;<span class="fff"> </span></span></span></div>
<div class="t m0 x7 hd y4e ffd fs3 fc0 sc0 ls0 ws0">➢<span class="ffe"> <span class="ff5">Problem 2: Building on the conclusions of Problem 1, analyze the feature indicators that affect house prices and the reasons why;<span class="fff"> </span></span></span></div>
<div class="t m0 x7 hd y4f ffd fs3 fc0 sc0 ls0 ws0">➢<span class="ffe"> <span class="ff5">Problem 3: Build a house price prediction model and analyze it.<span class="fff"> </span></span></span></div>
<div class="t m0 x10 hb y50 ffa fs2 fc0 sc0 ls0 ws0">2<span class="ffb"> <span 
class="ffc sc1 ls5">Problem Analysis</span></span> </div>
<div class="t m0 x6 hc y51 ffa fs4 fc0 sc0 ls0 ws0">2.1<span class="ffb"> <span class="ffc sc1">Analysis of Problem 1</span></span> </div>
<div class="t m0 x7 h7 y52 ff5 fs3 fc0 sc0 ls0 ws0">By following current events and consulting the literature, we identify the indicators that affect house sales and prices,</div>
<div class="t m0 x6 h7 y53 ff5 fs3 fc0 sc0 ls0 ws0">and interpret and discuss them. Sales and prices are mostly affected in similar ways, but some influencing indicators</div>
<div class="t m0 x6 h5 y54 ff5 fs3 fc0 sc0 ls0 ws0">can only be determined by analyzing the data.<span class="ff2"> </span></div>
<div class="t m0 x6 hc y55 ffa fs4 fc0 sc0 ls0 ws0">2.2<span class="ffb"> <span class="ffc sc1">Analysis of Problem 2</span></span> </div>
<div class="t m0 x5 h7 y56 ff5 fs3 fc0 sc0 ls0 ws0">Building on the discussion in Problem 1, we preprocess the data in the <span class="ff6">house</span> file, handling missing values, outliers,</div>
<div class="t m0 x6 h7 y57 ff5 fs3 fc0 sc0 ls0 ws0">and misaligned values appropriately. We then visualize each indicator's value distribution and its relationship to the total house price</div>
<div class="t m0 x6 h7 y58 ff5 fs3 fc0 sc0 ls0 ws0">to find the indicators that affect house sales and total price, dummy-encode the influential indicators whose values are categorical,</div>
<div class="t m0 x6 h5 y59 ff5 fs3 fc0 sc0 ls0 ws0">and summarize the influential features.<span class="ff2"> </span></div>
<div class="t m0 x6 hc y40 ffa fs4 fc0 sc0 ls0 ws0">2.3<span class="ffb"> <span class="ffc sc1">Analysis of Problem 3</span></span> </div>
<div class="t m0 x7 h7 y5a ff5 fs3 fc0 sc0 ls5 ws0"><span class="ls0">Based on the feature indicators found in Problem 2 to affect the total house price, we choose the Grid-Search Random Forest (<span class="fff">GSRF</span>)</span></div>
<div class="t m0 x6 h7 y5b ff5 fs3 fc0 sc0 ls0 ws0">algorithm to predict the total house price, and tune the model's parameters to obtain a better prediction.<span class="fff"> </span></div>
<div class="t m0 x10 hb y5c ffa fs2 fc0 sc0 ls0 ws0">3<span class="ffb"> <span class="ffc sc1 ls5">Assumptions</span></span> </div>
<div class="t m0 x6 h5 y5d ff5 fs3 fc0 sc1 ls0 ws0">Assumption 1: <span class="sc0">The data are true, valid, and reliable;<span class="ff2"> </span></span></div>
<div class="t m0 x6 h5 y5e ff5 fs3 fc0 sc1 ls0 ws0">Assumption 2: <span class="sc0">Villas have no elevator, i.e., their elevator count is zero.<span class="ff2"> </span></span></div></div></div><div class="pi" data-data='{"ctm":[1.611792,0.000000,0.000000,1.611792,0.000000,0.000000]}'></div></div><div id="pf4" class="pf w0 h0" data-page-no="4"><div class="pc pc4 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="/image.php?url=https://csdnimg.cn/release/download_crawler_static/89797149/bg4.jpg"><div class="c x0 y1 w2 h0"><div class="t m0 xc h3 y25 ff2 fs1 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 xf h3 y3 ff2 fs1 fc0 sc0 ls0 ws0">2 </div>
<div class="t m0 x11 hb y5f ffa fs2 fc0 sc0 ls0 ws0">4<span class="ffb"> <span class="ffc sc1 ls5">Problem 1: Interpretation and Discussion of Indicators</span></span> </div>
<div class="t m0 x7 h7 y60 ff5 fs3 fc0 sc0 ls0 ws0">Based on the current environment and the literature, we identified the factors that affect total house price and house sales,</div>
<div class="t m0 x6 h7 y61 ff5 fs3 fc0 sc0 ls0 ws0">and examined and discussed them. The following interprets some of the influencing indicators</div>
<div class="t m0 x12 he y62 fff fs8 fc0 sc0 ls0 ws0">[1][2]</div>
<div class="t m0 x13 h7 y61 ff5 fs3 fc0 sc0 ls0 ws0">:<span class="fff"> </span></div>
<div class="t m0 x7 h5 y63 ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x7 h7 y64 ff5 fs3 fc2 sc2 ls0 ws0">Location: <span class="sc0">Location is one of the most important factors for house sales and prices. Its effect on sales</span></div>
<div class="t m0 x6 h7 y65 ff5 fs3 fc2 sc0 ls0 ws0">and prices shows mainly in the area's economic conditions, transport convenience, employment, and wages. Technology</div>
<div class="t m0 x6 h7 y66 ff5 fs3 fc2 sc0 ls0 ws0">is advancing quickly, transport keeps improving, and competition for jobs keeps growing. Many young people want to build their careers</div>
<div class="t m0 x6 h7 y67 ff5 fs3 fc2 sc0 ls0 ws0">in big cities, which offer more job choices, advanced medical care, and convenient transport,</div>
<div class="t m0 x6 h7 y68 ff5 fs3 fc2 sc0 ls0 ws0">but house prices in big cities are correspondingly higher.<span class="fff"> </span></div>
<div class="t m0 x7 h7 y69 ff5 fs3 fc3 sc3 ls0 ws0">Orientation: <span class="sc0 ls5"><span class="ls0">There is much to say about orientation. In ancient times, homes were built to “sit north and face south”,</span></span></div>
<div class="t m0 x6 h7 y6a ff5 fs3 fc3 sc0 ls0 ws0">while for some modern buildings the orientation is determined by the windows of the living room and master bedroom. Buyers usually consider</div>
<div class="t m0 x6 h7 y6b ff5 fs3 fc3 sc0 ls0 ws0">ventilation and natural light, and some also consider feng shui. Orientation therefore has</div>
<div class="t m0 x6 h7 y6c ff5 fs3 fc3 sc0 ls0 ws0">some effect on house sales and prices.<span class="ff10"> </span></div>
<div class="t m0 x7 h7 y6d ff5 fs3 fc2 sc2 ls0 ws0">Floor level: <span class="sc0">High and low floors each have pros and cons. High floors offer better views and</span></div>
<div class="t m0 x6 h7 y6e ff5 fs3 fc2 sc0 ls0 ws0">more daylight and sunshine, but may carry some safety risk; low floors make it easier to come and go,</div>
<div class="t m0 x6 h7 y6f ff5 fs3 fc2 sc0 ls0 ws0">but may be affected by the surroundings and by burglary concerns; middle floors sit between the two</div>
<div class="t m0 x6 h7 y70 ff5 fs3 fc2 sc0 ls0 ws0">and are chosen by relatively more people.<span class="fff"> </span></div>
<div class="t m0 x7 h7 y71 ff5 fs3 fc2 sc2 ls0 ws0">Elevator-to-household ratio: <span 
class="sc0">The elevator-to-household ratio is the number of elevators in a building unit relative to the number of households on each floor. Its value</span></div>
<div class="t m0 x6 h7 y72 ff5 fs3 fc2 sc0 ls0 ws0">directly reflects the building's population density and layout structure. The lower the ratio, the more people live on each floor</div>
<div class="t m0 x6 h7 y73 ff5 fs3 fc2 sc0 ls0 ws0">and the longer the wait for an elevator at peak times; more households per floor also means smaller units. A high ratio</div>
<div class="t m0 x6 h7 y74 ff5 fs3 fc2 sc0 ls0 ws0">lowers the community's plot ratio and makes homes relatively expensive. A moderate ratio is therefore generally preferable.<span class="fff sc2"> </span></div>
<div class="t m0 x7 h7 y75 ff5 fs3 fc2 sc2 ls0 ws0">Layout: <span class="sc0">A home's layout mainly covers four parts: bedrooms, living room, kitchen, and bathrooms. This indicator</span></div>
<div class="t m0 x6 h7 y76 ff5 fs3 fc2 sc0 ls0 ws0">is also one of the main factors affecting house sales and prices. Buyers consider the layout's design, functional zoning, spatial arrangement,</div>
<div class="t m0 x6 h7 y77 ff5 fs3 fc2 sc0 ls0 ws0">and use of space.<span class="fff"> </span></div>
<div class="t m0 x7 h7 y78 ff5 fs3 fc2 sc2 ls0 ws0">Usage: <span class="sc0">A home's usage is an important reflection of its value. There are many kinds of usage, but the main three</span></div>
<div class="t m0 x6 h7 y79 ff5 fs3 fc2 sc0 ls0 ws0">are ordinary residential, mixed commercial-residential, and villa. Relatively speaking, villas cost the most per unit area and commercial-residential units</div>
<div class="t m0 x6 h7 y7a ff5 fs3 fc2 sc0 ls0 ws0">the least; most people buy ordinary residences, which make up the bulk of sales.<span class="fff"> </span></div>
<div class="t m0 x7 h7 y7b ff5 fs3 fc2 sc2 ls0 ws0">Decoration: <span class="sc0">The decoration status is the condition of the home at purchase, and it determines what work the buyer must arrange afterwards.</span></div>
<div class="t m0 x6 h7 y7c ff5 fs3 fc2 sc0 ls0 ws0">Fully finished homes are a big step up from basic-finish and bare-shell homes, and most buyers prefer them; buyers who have</div>
<div class="t m0 x6 h7 y7d ff5 fs3 fc2 sc0 ls0 ws0">their own design ideas can choose a basic finish and fit out the home as they wish.<span class="fff"> </span></div>
<div class="t m0 x7 h7 y7e ff5 fs3 fc2 sc2 ls0 ws0">Elevator: <span class="sc0">Many homes now come with an elevator, which residents of higher floors need even more. Elevators</span></div>
<div class="t m0 x6 h7 y7f ff5 fs3 fc2 sc0 ls0 ws0">give residents efficient, convenient access. Homes with an elevator are more popular with buyers.<span class="fff"> </span></div>
<div class="t m0 x7 h7 y80 fff fs3 fc2 sc0 ls0 ws0"> </div>
<div class="t m0 x7 h7 y81 fff fs3 fc2 sc2 ls0 ws0"> </div>
<div class="t m0 x7 h7 y82 ff5 fs3 fc2 sc0 ls5 ws0"><span class="ls0">The above interprets some of the indicators that affect house sales and prices and clarifies what each one means.</span></div>
<div class="t m0 x6 h7 y83 ff5 fs3 fc2 sc0 ls0 ws0">Many other factors also affect house sales and prices, and the influencing indicators differ across regional datasets.</div>
<div class="t m0 x6 h7 y84 ff5 fs3 fc2 sc0 ls0 ws0">The influencing indicators can therefore only be determined well by examining and analyzing the data.<span class="fff"> </span></div>
<div class="t m0 x7 h7 y85 fff fs3 fc2 sc2 ls0 ws0"> </div>
<div class="t m0 x6 hf y86 ff2 fs3 fc0 sc0 ls0 ws0"> <span class="_ _10"> </span><span class="ffa fs2"> </span></div></div></div><div class="pi" data-data='{"ctm":[1.611792,0.000000,0.000000,1.611792,0.000000,0.000000]}'></div></div><div id="pf5" class="pf w0 h0" data-page-no="5"><div class="pc pc5 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="/image.php?url=https://csdnimg.cn/release/download_crawler_static/89797149/bg5.jpg"><div class="c x0 y1 w2 h0"><div class="t m0 xc h3 y25 ff2 fs1 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 xf h3 y3 ff2 fs1 fc0 sc0 ls0 ws0">3 </div>
<div class="t m0 x11 hb y5f ffa fs2 fc0 sc0 ls0 ws0">5<span class="ffb"> <span class="ffc sc1 
ls5">Problem 2: Data Analysis and Processing</span></span> </div>
<div class="t m0 x5 h7 y60 ff5 fs3 fc0 sc0 ls0 ws0">Based on the interpretation in Problem 1 and the factors that may affect price, we now preprocess and visually analyze the data</div>
<div class="t m0 x6 h5 y61 ff5 fs3 fc0 sc0 ls0 ws0">to find the indicators that affect house sales and total price in this dataset.<span class="ff2"> </span></div>
<div class="t m0 x6 hc y87 ffa fs4 fc0 sc0 ls0 ws0">5.1<span class="ffb"> <span class="ffc sc1">Data Preprocessing</span></span> </div>
<div class="t m0 x5 h7 y88 ff5 fs3 fc0 sc1 ls0 ws0">Preprocessing 1: Misaligned values. <span class="sc0">In the data file, rows whose usage is “别墅” (villa) have</span></div>
<div class="t m0 x6 h5 y89 ff5 fs3 fc0 sc0 ls0 ws0">some feature values shifted into the wrong columns; we shift these values back under their correct feature indicators.<span class="ff2"> </span></div>
<div class="t m0 x6 h5 y8a ff5 fs3 fc0 sc0 ls0 ws0">After fixing the misalignment, we find that the layout structure is missing for every villa, and fill it with “其他” (other).<span class="ff2"> </span></div>
<div class="t m0 x5 h7 y8b ff5 fs3 fc0 sc1 ls0 ws0">Preprocessing 2: New indicator column. <span class="sc0">While fixing the misaligned data, we find that the “配备电梯” (elevator) indicator contains</span></div>
<div class="t m0 x6 h7 y8c ff5 fs3 fc0 sc0 ls0 ws0">values such as “集中供暖” (central heating) and “自供暖” (self-heating), which do not belong to that indicator.</div>
<div class="t m0 x6 h7 y8d ff5 fs3 fc0 sc0 ls0 ws0">We therefore add a new feature column, “房屋暖气” (heating), and move the “集中供暖” and “自供暖” values</div>
<div class="t m0 x6 h5 y8e ff5 fs3 fc0 sc0 ls0 ws0">into it.<span class="ff2"> </span></div>
<div class="t m0 x5 h7 y8f ff5 fs3 fc0 sc1 ls0 ws0">Preprocessing 3: Removing useless indicators. <span class="sc0">Inspecting the indicators shows that the values of “房屋号码” (house number) are corrupted by a format</span></div>
<div class="t m0 x6 h7 y90 ff5 fs3 fc0 sc0 ls0 ws0">error that cannot be repaired; “图片” (image) and “链接” (link) are just web links; and “房产权” (property right) has the single value “<span class="ff6">70</span> 年” (70 years).</div>
<div class="t m0 x6 h7 y91 ff5 fs3 fc0 sc0 ls0 ws0">These indicators have no analytical value, so their columns are deleted, as are “房本备件” (ownership certificate copy),</div>
<div class="t m0 x6 h5 y92 ff5 fs3 fc0 sc0 ls0 ws0">“编号” (ID), “挂牌时间” (listing time), and “上次交易” (last transaction).<span class="ff2"> </span></div>
<div class="t m0 x5 h7 y93 ff5 fs3 fc0 sc1 ls0 ws0">Preprocessing 4: Missing values. <span class="sc0">After the steps above, many values are “暂无数据” (no data). To</span></div>
<div class="t m0 x6 h7 y94 ff5 fs3 fc0 sc0 ls0 ws0">count missing values more easily, every “暂无数据” in the table is replaced with a null value. Because this paper is about house price prediction,</div>
<div class="t m0 x6 h7 y95 ff5 fs3 fc0 sc0 ls0 ws0">rows with a missing “房屋总价格” (total house price) are deleted first. The resulting missing-value counts for each indicator</div>
<div class="t m0 x6 h5 y96 ff5 fs3 fc0 sc0 ls0 ws0">are shown in Figure <span class="ff6">1</span>.<span class="ff2"> </span></div>
<div class="t m0 x9 h5 y97 ff2 fs3 fc0 sc0 ls0 ws0"> </div>
<div class="t m0 x14 h10 y98 ffc fs6 fc0 sc0 ls0 ws0">Figure<span class="ff4"> 1 </span>Number of missing values per feature indicator<span class="ff4"> </span></div>
<div class="t m0 x5 h7 y99 ff5 fs3 fc0 sc0 ls0 ws0">Missing values ①: The missing “小区名字” (community name) and “所在区域” (district) values come from the same</div>
<div class="t m0 x6 h7 y9a ff5 fs3 fc0 sc0 ls0 ws0">row. The row's “房屋主题” (listing title) quickly shows that the community name is “禄徽苑”; filtering on that</div>
<div class="t m0 x6 h5 y9b ff5 fs3 fc0 sc0 ls0 ws0">community name shows that the row's district is “长丰北城”. Both values are filled in.<span class="ff2"> </span></div>
<div class="t m0 x5 h5 y9c ff2 fs3 fc0 sc0 ls0 ws0"> </div></div></div><div class="pi" data-data='{"ctm":[1.611792,0.000000,0.000000,1.611792,0.000000,0.000000]}'></div></div>
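The missing-value step described in section 5.1 (replace the “暂无数据” placeholder with a null value, drop rows that have no “房屋总价格”, then count the remaining missing values per indicator) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the paper's own source code: the function name `clean_missing`, the list-of-dicts data format, and the toy rows are assumptions made for the example; only the placeholder string and the column names come from the text. With pandas, the same steps would typically be expressed with `DataFrame.replace`, `DataFrame.dropna(subset=...)`, and `DataFrame.isna().sum()`.

```python
from collections import Counter

PLACEHOLDER = "暂无数据"   # placeholder string named in the text
TARGET = "房屋总价格"       # total house price column named in the text


def clean_missing(rows, placeholder=PLACEHOLDER, target=TARGET):
    """Normalize placeholders to None, drop rows with no target, count what is still missing.

    `rows` is a list of dicts (one dict per listing); this format is an
    assumption for the sketch, not the format used by the paper's code.
    """
    cleaned = []
    for row in rows:
        # Step 1: treat the placeholder string as a missing value.
        normalized = {key: (None if value == placeholder else value)
                      for key, value in row.items()}
        # Step 2: a row without a total price cannot be used for prediction.
        if normalized.get(target) is None:
            continue
        cleaned.append(normalized)

    # Step 3: count the remaining missing values per column.
    missing = Counter()
    for row in cleaned:
        for key, value in row.items():
            if value is None:
                missing[key] += 1
    return cleaned, dict(missing)


# Hypothetical toy rows, invented purely to exercise the function.
toy_rows = [
    {"小区名字": "A", "所在区域": "X", "房屋总价格": "100"},
    {"小区名字": "暂无数据", "所在区域": "暂无数据", "房屋总价格": "90"},
    {"小区名字": "B", "所在区域": "Y", "房屋总价格": "暂无数据"},
]
cleaned_rows, missing_counts = clean_missing(toy_rows)
print(len(cleaned_rows), missing_counts)
```

Dropping the rows without a target before counting keeps the per-column counts aligned with the rows that can actually be used for modelling, which matches the order of operations given in the text.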