MySQL5.7版本后GROUP BY问题

点亮(0)

点灭(0)

收藏(0)

2021-12-07 11:30:22

1. 起因

最近在做一个开源项目,中间用到了group by查询,在本地测试的时候没有啥问题,但是项目更新到服务器上之后出现了下面的错误

ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'bbs.t_private_message.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

字面上的意思是SELECT里面的列必须包含在GROUP BY当中,于是网上冲浪查询相关的坑,发现有下面两种解决方案。

2. 修改sql_mode

修改global.sql_mode有两种方式,第一种通过SET语句进行修改,如下

select @@global.sql_mode
# ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
set @@global.sql_mode = `STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION`

如果上面的方法没有生效,试着用下面的方法

select @@session.sql_mode
#ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
set @@session.sql_mode = `STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION`

第二种方式是通过修改配置文件,这种方法的可行性有待商榷,我自行修改没有生效,还会导致MySQL服务启动不起来了

# 在文件最后加上下面的内容
[mysqld]
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
# 重启MySQL服务
sudo systemctl restart mysql.service

3. 使用ANY_VALUE

程序猿都比较喜欢折腾,我也是。总感觉MySQL自动开启ON_FULL_GROUP_BY的运行模式有它自己的考虑。因此就去MySQL官网查找相关的文档,在GROUP BY这一节开篇有下面一段话

SQL-92 and earlier does not permit queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are not named in the GROUP BY clause. For example,this query is illegal in standard SQL-92 because the nonaggregated name column in the select list does not appear in the GROUP BY:

SELECT o.custid, c.name, MAX(o.payment)
  FROM orders AS o, customers AS c
  WHERE o.custid = c.custid
  GROUP BY o.custid;

上面引用的大概意思就是在SQL-92以及更早的版本SELECT、HAVING、ORDER BY中出现的字段名必须包含在GROUP BY当中,上面的查询语句当中包含有name字段,但是name字段没有包含在GROUP BY当中,所以上述SQL语句是非法的在SQL-92后的版本。

继续往下看也看到了上面两种解决方式sql_mode相关介绍,继续往下翻看到了下面的内容

If you know that, for a given data set, each name value in fact uniquely determines the address value, address is effectively functionally dependent on name. To tell MySQL to accept the query, you can use the ANY_VALUE()

意思就是可以使用ANY_VALUE()这个函数来包裹你想要查询的列,但是这个列又可以不包裹在GROUP BY当中,如下面的SQL语句就是合法的

SELECT name, ANY_VALUE(address), MAX(age) FROM t GROUP BY name;

4. 栗子

说一千道一万还不如举个栗子来个实在,这个问题是在我开源项目的私信模块里面发现的,最开始我就是想通过GROUP BY查找到与当前用户相关的私信,通过sender_id来进行分组,私信表结构如下图

下面就通过一个栗子来演示一下

"""
# coding:utf-8
@Time    : 2021/11/23
@Author  : jiangwei
@File    : private_letter.py
@Desc    : private_letter
@email   : qq804022023@gmail.com
@Software: PyCharm
"""
from flask import Flask, render_template, redirect, url_for, flash
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.sql.expression import func

from faker import Faker
import random
import datetime

username = 'jiangwei'
password = '1994124'
database = 'demo'
f = Faker(locale='zh_CN')
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = f'mysql+pymysql://{username}:{password}@localhost/{database}?charset=utf8'
app.config['SECRET_KEY'] = 'sfajn1314jnm14h1'
db = SQLAlchemy(app)


class PrivateMessage(db.Model):
    __tablename__ = 't_private_message'

    id = db.Column(db.INTEGER, primary_key=True, autoincrement=True)
    sender_id = db.Column(db.INTEGER, nullable=False)
    receiver_id = db.Column(db.INTEGER, nullable=False)
    content = db.Column(db.TEXT, nullable=False, default='')
    c_time = db.Column(db.DATETIME, default=datetime.datetime.now)


db.create_all()


@app.route('/')
def index():
    user_id = random.randint(1, 3)
    return render_template('index.html', user_id=user_id)


@app.route('/insert')
def insert():
    for i in range(10):
        pm = PrivateMessage(
            sender_id=random.randint(1, 20),
            receiver_id=random.randint(1, 3),
            content=f.sentence()
        )
        db.session.add(pm)
    db.session.commit()
    flash('插入数据成功!')
    return redirect(url_for('.index'))


@app.route('/insert/<user_id>')
def query(user_id):
    pms = PrivateMessage.query.\
        with_entities(PrivateMessage.sender_id,
                      func.any_value(PrivateMessage.content),
                      func.max(PrivateMessage.c_time)). \
        filter(PrivateMessage.receiver_id == user_id). \
        order_by(func.max(PrivateMessage.c_time).desc()). \
        group_by(PrivateMessage.sender_id).all()
    return render_template('query.html', pms=pms, user_id=user_id)


if __name__ == '__main__':
    app.run(port=5005, debug=True)

上面是一个非常简答的Flask应用,首先我们通过继承db.Model来定义表t_private_message的表结构,之后通过db.create_all来创建表,然后实例化Flask创建一个flask实例,通过该实例创建了三个视图函数indexinsertquery,分别对应首页视图插入数据视图查询数据视图,我们所用到的any_value方法就包含在查询视图中,代码如下

@app.route('/insert/<user_id>')
def query(user_id):
    pms = PrivateMessage.query.\
        with_entities(PrivateMessage.sender_id,
                      func.any_value(PrivateMessage.content),
                      func.max(PrivateMessage.c_time)). \
        filter(PrivateMessage.receiver_id == user_id). \
        order_by(func.max(PrivateMessage.c_time).desc()). \
        group_by(PrivateMessage.sender_id).all()
    return render_template('query.html', pms=pms, user_id=user_id)

sqlalchemy中我们可以使用with_entities来指定我们需要查询的列,通过any_value函数包裹content、c_time字段,然后通过c_time进行排序,最后通过GROUP BY进行分组,点击首页页面上插入数据按钮插入10条测试数据,然后点击查询私信按钮结果如下图

我们可以清楚地看到Sender ID出现顺序根据Send Time进行了逆序排列了,达到了最新发送的消息在最上方的目的,如下图(来自我的开源项目university-bbs私信模块,请忽略聊天文本,仅仅是为了测试所用二狗学院)

sqlalchemy中如果使用了with_entities来指定了查询列,查询结果不再是Object对象了,而是一个列表中嵌套了元祖的形式!

上面的代码只能在MySQL5.7以上的版本运行,因为MySQL5.7以下的版本没有ANY_VALUE这个函数,因此为了适配MySQL5.7以下的版本需要改进以下代码,如下代码段所示

@app.route('/insert/<user_id>')
def query(user_id):
    try:
        pms = PrivateMessage.query.\
            with_entities(PrivateMessage.sender_id,
                          func.any_value(PrivateMessage.content),
                          func.max(PrivateMessage.c_time)). \
            filter(PrivateMessage.receiver_id == user_id). \
            order_by(func.max(PrivateMessage.c_time).desc()). \
            group_by(PrivateMessage.sender_id).all()
    except Exception as e:
        pms = PrivateMessage.query.\
            with_entities(PrivateMessage.sender_id,
                          PrivateMessage.content,
                          func.max(PrivateMessage.c_time)).\
            filter(PrivateMessage.receiver_id == user_id).\
            order_by(func.max(PrivateMessage.c_time).desc()).\
            group_by(PrivateMessage.sender_id).all()
    return render_template('query.html', pms=pms, user_id=user_id)

很简单就是通过try来捕获异常,如果发生异常则使用except代码块里面的查询,上述示例代码存放于https://github.com/weijiang1994/blog-demo.git

2条回复    ·   2022-03-21 09:40:38

测试代码样式

@app.route('/insert/<user_id>')
def query(user_id):
    try:
        pms = PrivateMessage.query.\
            with_entities(PrivateMessage.sender_id,
                          func.any_value(PrivateMessage.content),
                          func.max(PrivateMessage.c_time)). \
            filter(PrivateMessage.receiver_id == user_id). \
            order_by(func.max(PrivateMessage.c_time).desc()). \
            group_by(PrivateMessage.sender_id).all()
    except Exception as e:
        pms = PrivateMessage.query.\
            with_entities(PrivateMessage.sender_id,
                          PrivateMessage.content,
                          func.max(PrivateMessage.c_time)).\
            filter(PrivateMessage.receiver_id == user_id).\
            order_by(func.max(PrivateMessage.c_time).desc()).\
            group_by(PrivateMessage.sender_id).all()
    return render_template('query.html', pms=pms, user_id=user_id)

您尚未登录, 登录注册 后评论