Running the Python Version of Wo with Streaming

Streaming Python Wo · 2023-01-31 07:01:43 · 451 views · 独家记忆

1. Write the mapper first (mapper.py)

import sys

# Read lines from standard input and emit one "word<TAB>1" pair per word.
for line in sys.stdin:
    line = line.strip()
    words = line.split()
    for word in words:
        print('%s\t%s' % (word, 1))
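Before submitting a job, the mapper logic can be checked locally. The following is a minimal sketch that wraps the same split-and-emit step in a function; the name `map_lines` and the sample input are illustrative assumptions and do not appear in the original script.

```python
def map_lines(lines):
    """Yield 'word<TAB>1' for every whitespace-separated word, as mapper.py does."""
    for line in lines:
        for word in line.strip().split():
            yield '%s\t%s' % (word, 1)


# Hypothetical sample input standing in for the lines read from sys.stdin.
sample = ["hello world", "hello hadoop"]
mapped = list(map_lines(sample))
for pair in mapped:
    print(pair)
```

Each word is emitted with a count of 1, including repeated words; adding the counts together is left entirely to the reducer.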

2. Write the reducer (reducer.py)

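import sys

current_word = None
current_count = 0
word = None

# The input arrives sorted by key, so identical words are on adjacent lines.
for line in sys.stdin:
    line = line.strip()
    word, count = line.split('\t', 1)
    try:
        count = int(count)
    except ValueError:
        # Skip lines whose count is not a number.
        continue
    if current_word == word:
        current_count += count
    else:
        if current_word:
            # The word changed: emit the total for the previous word.
            print('%s\t%s' % (current_word, current_count))
        current_count = count
        current_word = word

# Emit the total for the last word (nothing is printed for empty input).
if current_word:
    print('%s\t%s' % (current_word, current_count))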

3. Run the Python scripts with Hadoop Streaming

hadoop jar /home/hadoop/hadoop-2.6.0-cdh5.5.2/share/hadoop/tools/lib/hadoop-streaming-2.6.0-cdh5.5.2.jar -input /user/hadoop/aa.txt -output /user/hadoop/Python_output -mapper "python mapper.py" -reducer "python reducer.py" -file mapper.py -file reducer.py

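Notes:

The input and output paths are resolved on HDFS by default, so they do not need an explicit HDFS prefix.

If the quotation marks around the -mapper and -reducer commands ("python mapper.py" and "python reducer.py") are omitted, the job fails with this error: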

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
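The quotes matter because the shell splits a command line on whitespace before `hadoop` ever sees it. As a rough illustration, Python's `shlex.split` applies POSIX-shell-like tokenization, so it can show how the two variants are divided into arguments. This sketch covers only the tokenization step; how Hadoop Streaming then treats the extra arguments is not modeled here.

```python
import shlex

# With quotes, each command is passed as a single argument value.
quoted = shlex.split('-mapper "python mapper.py" -reducer "python reducer.py"')

# Without quotes, "python" and the script name become separate arguments.
unquoted = shlex.split('-mapper python mapper.py -reducer python reducer.py')

print(quoted)
print(unquoted)
```

In the unquoted form, the value that follows `-mapper` is just `python`, and `mapper.py` arrives as a separate, unrelated argument.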

If the part of the command highlighted in pink in the original post is omitted, the job fails with this error:

Error: java.lang.RuntimeException: Error in configuring object


Article link: https://lsjlt.com/news/191299.html (please cite the source link when reposting)
