environmental_protection_grade · Changes

update: environmental protection grade: updated result field parsing; authored Nov 01, 2021 by 蒋家升
Showing with 154 additions and 7 deletions
data_stream/environmental_protection_grade.md
View page @ 063c8ddd
...@@ -33,7 +33,9 @@ environmental_protection_grade ...@@ -33,7 +33,9 @@ environmental_protection_grade
Guangxi: http://202.103.233.156:9081/xypjgx/pages/xypj/wzgs/qypjjgList.jsp Guangxi: http://202.103.233.156:9081/xypjgx/pages/xypj/wzgs/qypjjgList.jsp
Hebei: http://110.249.223.66:8099/xypjww/xypj/listEntEvaluate Hebei: http://110.249.223.66:8099/xypjww/xypj/listEntEvaluate
Liaoning: http://221.180.204.224:8080/LiaoNingQiYeXinYongPingJia/display/getEnterpriseInfo.do Liaoning: http://221.180.204.224:8080/LiaoNingQiYeXinYongPingJia/display/getEnterpriseInfo.do
... Shandong: http://103.239.155.242:7002/xypjgzd/business/xypj/xypjcontroller/
Jilin: http://125.32.96.149:8081/was5/web/search
Anhui: http://112.27.211.29:8082/wznrfb/getQueryList
Storage path for collected files: Storage path for collected files:
/data/gravel_spiders/environmental_protection /data/gravel_spiders/environmental_protection
...@@ -108,35 +110,44 @@ environmental_protection ...@@ -108,35 +110,44 @@ environmental_protection
{"province": "fujian", "step": "start"}, {"province": "fujian", "step": "start"},
{"province": "sichuan", "step": "start"}, {"province": "sichuan", "step": "start"},
{"province": "hunan", "step": "start"}, {"province": "hunan", "step": "start"},
{"province": "henan", "step": "start"}, {"province": "henan", "step": "start", "index": 0, "city": "郑州市"},
{"province": "hubei", "step": "start"}, {"province": "hubei", "step": "start"},
{"province": "guangdong", "step": "start"}, {"province": "guangdong", "step": "start"},
{"province": "guizhou", "step": "start"}, {"province": "guizhou", "step": "start", "index": 0, "year": 2019},
{"province": "guangxi", "step": "start"}, {"province": "guangxi", "step": "start"},
{"province": "hebei"}, {"province": "hebei"},
{"province": "liaoning"} {"province": "liaoning"},
{"province": "shandong", "company": "山东金穗木业股份有限公司"},
{"province": "jilin", "step": "start"},
{"province": "anhui", "step": "start", "index": 0, "year": 2019}
``` ```
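### Task parameter description ### Task parameter description
<!--Describe spider-specific parameters only; common parameters such as spider_name, task_params, task_src, and task_result need no description--> <!--Describe spider-specific parameters only; common parameters such as spider_name, task_params, task_src, and task_result need no description-->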
```json ```json
{'province': 'henan', "step": "start", "index": 0, "city": "郑州市"} {"province": "henan", "step": "start", "index": 0, "city": "郑州市"}
{'province': 'guizhou', "step": "start", "index": 0, "year": 2019} {"province": "guizhou", "step": "start", "index": 0, "year": 2019}
{"province": "shandong", "company": "山东金穗木业股份有限公司"}
{"province": "jilin", "step": "start", "index": 0, "level": "blue"}
``` ```
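> + Main parameters > + Main parameters
> + province: province name in pinyin > + province: province name in pinyin
> + index: page number for pagination (not needed in some cases)
> + Optional parameters > + Optional parameters
> + step: crawl step > + step: crawl step
> + index: page number for pagination > + Special parameters
> + city: prefecture-level city; only Henan has this field > + city: prefecture-level city; only Henan has this field
> + year: year; only Guizhou has this field > + year: year; only Guizhou and Anhui have this field
> + company: company name or unified credit code; only Shandong has this field
> + level: grade; only Jilin has this field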
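
The province-specific rules in the list above can be checked mechanically before a task is queued. The following is a minimal sketch, not the spider's actual validation: `check_task_params` and `PROVINCE_ONLY_PARAMS` are hypothetical names, and treating `index` as a non-negative integer is an assumption based on the sample tasks.

```python
# Parameters that the list above documents for specific provinces only.
PROVINCE_ONLY_PARAMS = {
    "city": {"henan"},
    "year": {"guizhou", "anhui"},
    "company": {"shandong"},
    "level": {"jilin"},
}


def check_task_params(params):
    """Return a list of problems found in one task_params dict (empty if none)."""
    problems = []
    province = params.get("province")
    if not province:
        # province is the only parameter every sample task carries.
        problems.append("missing required parameter: province")
        return problems
    for name, provinces in PROVINCE_ONLY_PARAMS.items():
        if name in params and province not in provinces:
            allowed = ", ".join(sorted(provinces))
            problems.append(f"{name} is only documented for: {allowed}")
    index = params.get("index")
    # Assumption: index is a zero-based or one-based page counter, never negative.
    if index is not None and (
        isinstance(index, bool) or not isinstance(index, int) or index < 0
    ):
        problems.append("index should be a non-negative integer page number")
    return problems
```

For example, `check_task_params({"province": "hebei", "year": 2019})` reports that `year` is only documented for Anhui and Guizhou, while the Henan sample task passes without problems.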
## data_type description ## data_type description
<!--Describe the data_type values that may be produced--> <!--Describe the data_type values that may be produced-->
```buildoutcfg ```buildoutcfg
list: list-page data list: list-page data
detail: detail-page data; currently only Shandong produces detail
``` ```
## Super data of the spider results ## Super data of the spider results
...@@ -892,6 +903,142 @@ list: list-page data ...@@ -892,6 +903,142 @@ list: list-page data
``` ```
> [Field parsing](data_stream/environmental_protection_related/field) > [Field parsing](data_stream/environmental_protection_related/field)
#### Shandong:
```json
{
"data":
[
{
"XH": "20211029230149891b2f3dcb6a41fbba72a2315794219a",
"QYGXJGMC": "",
"DF": 7,
"YSBS": "#ffff33",
"XTXH": "1635519677032004730880",
"QYMC": "利津誉鑫新型建材有限责任公司",
"QYBH": "8efdb9506bea47a9f78017a328a6b981",
"QYDZ": "山东省东营市利津县经济开发区S316路南(利津力能热电对过)",
"TYSHXYDM": "91370522MA3CFETM8A",
"FRDB": "刘彬",
"PJRQ": "2021-10-29",
"DJFLMC": "黄色等级"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-30 16:26:25.305",
"spider_end_time": "2021-10-30 16:26:29",
"task_params": {"province": "shandong","company": "利津誉鑫新型建材有限责任公司"},
"metadata": {"province": "shandong","company": "利津誉鑫新型建材有限责任公司"},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.14"
}
```
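The Shandong keys are abbreviations, so a consumer may want readable names. The sketch below renames them; the meanings in `SHANDONG_FIELD_NAMES` are assumptions inferred from the pinyin initials and the sample values above, not confirmed by this page, and `rename_shandong_record` is a hypothetical helper.

```python
# Assumed meanings, inferred from pinyin initials and the sample record;
# keys missing from this map (XH, XTXH, QYBH, QYGXJGMC) are passed through unchanged.
SHANDONG_FIELD_NAMES = {
    "QYMC": "company_name",
    "QYDZ": "company_address",
    "TYSHXYDM": "unified_social_credit_code",
    "FRDB": "legal_representative",
    "PJRQ": "evaluation_date",
    "DJFLMC": "grade_name",
    "DF": "score",
    "YSBS": "color_hex",
}


def rename_shandong_record(record):
    """Return a copy of one Shandong data row with readable key names."""
    return {SHANDONG_FIELD_NAMES.get(key, key): value for key, value in record.items()}
```

Unknown keys are kept rather than dropped so that nothing from the raw result is lost if the source adds fields later.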
#### Jilin:
```json
{
"data":
[
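{
"address": "吉林省吉林市永吉县岔路河镇一委", # company address
"capital": "100", # registered capital (in units of 10,000 CNY)
"code": "91220221MA0Y3JCJ8A", # social credit code / business license number
"id": "427",
"lawer": "许笑维", # legal representative or person in charge
"name": "永吉县仁合供热有限公司", # company name
"score": "3", # total points deducted
"wasid": "218230",
"level": "blue" # current environmental credit status, tied to score: blue label (blue): score >= 1 and score <= 6; yellow label (yellow): score >= 7 and score <= 11; red label (red): score >= 12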
},
{
"wasid": "218230",
"id": "596",
"code": "91220112310008964E",
"name": "长春市泓利供热有限公司",
"capital": "500",
"address": "长春市双阳区山河街道泓利港湾小区东侧",
"score": "2",
"lawer": "李建",
"level": "blue"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-11-01 10:54:59.762",
"spider_end_time": "2021-11-01 10:55:01",
"task_params":
{
"province": "jilin",
"step": "start",
"level": "blue",
"index": 3
},
"metadata":
{
"province": "jilin",
"level": "blue",
"index": 3
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
```
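The relationship between `score` and `level` stated in the comment above can be written as a small function, for instance to cross-check crawled rows. This is a sketch under the documented thresholds only; `level_from_score` is a hypothetical name, and returning `None` for scores below 1 is an assumption, since the page does not say which label applies there.

```python
def level_from_score(score):
    """Map a Jilin deduction score to its label, or None outside the documented ranges."""
    value = int(score)  # the sample payload carries score as a string, e.g. "3"
    if 1 <= value <= 6:
        return "blue"
    if 7 <= value <= 11:
        return "yellow"
    if value >= 12:
        return "red"
    return None  # scores below 1 are not covered by the documented thresholds
```

Both sample rows above (scores "3" and "2") map to `blue`, which matches their `level` field.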
#### Anhui:
```json
{
"data":
[
{
"area_id": "繁昌县", # region
"b_id": "9297d507725146d08b7d5f9d8cf0f9c6",
"com_id": "ebc07434060e4670827fd462df69ceeb",
"com_name": "芜湖海螺水泥有限公司", # company name
"cplx": "省级", # evaluation category
"id": "a36e410a3bfc4b0091682f815db93b41",
"public_level": "环保诚信企业", # annual rating result
"row_number": 21,
"year": 2016 # year
},
{
"area_id": "经济技术开发区",
"b_id": "8d0e5dd6986949d198cf69af39eec955",
"com_id": "5a581012297445e5852afbb2d6a32afc",
"com_name": "安徽楚江科技新材料股份有限公司",
"cplx": "省级",
"id": "15850528a44541c2813782b6dcbcaa0b",
"public_level": "环保诚信企业",
"row_number": 22,
"year": 2016
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-11-01 13:43:36.944",
"spider_end_time": "2021-11-01 13:43:37",
"task_params":
{
"province": "anhui",
"step": "start",
"year": 2016,
"index": 3
},
"metadata":
{
"province": "anhui",
"year": 2016,
"index": 3
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
```
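All three results share the same envelope around `data`, so downstream code can unpack them the same way. The sketch below yields each row tagged with the province from `metadata`. It assumes that `task_result == 1000` together with `http_code == 200` marks a successful crawl, which is what every sample above shows but is not stated on this page; `records_from_result` is a hypothetical helper.

```python
def records_from_result(result):
    """Yield each data row of one spider result with its province attached.

    Assumption: task_result == 1000 with http_code == 200 means success;
    any other combination yields nothing.
    """
    if result.get("task_result") != 1000 or result.get("http_code") != 200:
        return
    province = result.get("metadata", {}).get("province")
    for row in result.get("data", []):
        # Row keys win if a row ever carries its own "province" field.
        yield {"province": province, **row}
```

Because the function is a generator, callers would typically wrap it in `list(...)` or iterate over it directly.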
## Spider runtime environment ## Spider runtime environment
<!--udm module? scrapy? or other--> <!--udm module? scrapy? or other-->
......