aboutsummaryrefslogtreecommitdiff
path: root/gcc/doc
diff options
context:
space:
mode:
authorH.J. Lu <hongjiu.lu@intel.com>2007-05-22 14:37:19 +0000
committerH.J. Lu <hongjiu.lu@intel.com>2007-05-22 14:37:19 +0000
commit4128d31ef3b202ec3234e9dad572edb0c847d85a (patch)
tree2c9aaab1ef7877eb59417c50602fc058c38c90d6 /gcc/doc
parent975aa68b087e9b443fc7df8d60f8fd8a08393d20 (diff)
2007-05-22 H.J. Lu <hongjiu.lu@intel.com>
Richard Henderson <rth@redhat.com> * config.gcc (i[34567]86-*-*): Add smmintrin.h to extra_headers. (x86_64-*-*): Likewise. * i386/i386-modes.def (V2QI): New. * config/i386/i386.c (ix86_handle_option): Handle SSE4.1 and SSE4A. (override_options): Support SSE4.1. (IX86_BUILTIN_BLENDPD): New for SSE4.1. (IX86_BUILTIN_BLENDPS): Likewise. (IX86_BUILTIN_BLENDVPD): Likewise. (IX86_BUILTIN_BLENDVPS): Likewise. (IX86_BUILTIN_PBLENDVB128): Likewise. (IX86_BUILTIN_PBLENDW128): Likewise. (IX86_BUILTIN_DPPD): Likewise. (IX86_BUILTIN_DPPS): Likewise. (IX86_BUILTIN_INSERTPS128): Likewise. (IX86_BUILTIN_MOVNTDQA): Likewise. (IX86_BUILTIN_MPSADBW128): Likewise. (IX86_BUILTIN_PACKUSDW128): Likewise. (IX86_BUILTIN_PCMPEQQ): Likewise. (IX86_BUILTIN_PHMINPOSUW128): Likewise. (IX86_BUILTIN_PMAXSB128): Likewise. (IX86_BUILTIN_PMAXSD128): Likewise. (IX86_BUILTIN_PMAXUD128): Likewise. (IX86_BUILTIN_PMAXUW128): Likewise. (IX86_BUILTIN_PMINSB128): Likewise. (IX86_BUILTIN_PMINSD128): Likewise. (IX86_BUILTIN_PMINUD128): Likewise. (IX86_BUILTIN_PMINUW128): Likewise. (IX86_BUILTIN_PMOVSXBW128): Likewise. (IX86_BUILTIN_PMOVSXBD128): Likewise. (IX86_BUILTIN_PMOVSXBQ128): Likewise. (IX86_BUILTIN_PMOVSXWD128): Likewise. (IX86_BUILTIN_PMOVSXWQ128): Likewise. (IX86_BUILTIN_PMOVSXDQ128): Likewise. (IX86_BUILTIN_PMOVZXBW128): Likewise. (IX86_BUILTIN_PMOVZXBD128): Likewise. (IX86_BUILTIN_PMOVZXBQ128): Likewise. (IX86_BUILTIN_PMOVZXWD128): Likewise. (IX86_BUILTIN_PMOVZXWQ128): Likewise. (IX86_BUILTIN_PMOVZXDQ128): Likewise. (IX86_BUILTIN_PMULDQ128): Likewise. (IX86_BUILTIN_PMULLD128): Likewise. (IX86_BUILTIN_ROUNDPD): Likewise. (IX86_BUILTIN_ROUNDPS): Likewise. (IX86_BUILTIN_ROUNDSD): Likewise. (IX86_BUILTIN_ROUNDSS): Likewise. (IX86_BUILTIN_PTESTZ): Likewise. (IX86_BUILTIN_PTESTC): Likewise. (IX86_BUILTIN_PTESTNZC): Likewise. (IX86_BUILTIN_VEC_EXT_V16QI): Likewise. (IX86_BUILTIN_VEC_SET_V2DI): Likewise. (IX86_BUILTIN_VEC_SET_V4SF): Likewise. (IX86_BUILTIN_VEC_SET_V4SI): Likewise. (IX86_BUILTIN_VEC_SET_V16QI): Likewise. (bdesc_ptest): New. (bdesc_sse_3arg): Likewise. (bdesc_2arg): Likewise. (bdesc_1arg): Likewise. (ix86_init_mmx_sse_builtins): Support SSE4.1. Handle SSE builtins with 3 args. (ix86_expand_sse_4_operands_builtin): New. (ix86_expand_unop_builtin): Support 2 arg builtins with a constant smaller than 8 bits as the 2nd arg. (ix86_expand_sse_ptest): New. (ix86_expand_builtin): Support SSE4.1. Support 3 arg SSE builtins. (ix86_expand_vector_set): Support SSE4.1. (ix86_expand_vector_extract): Likewise. * config/i386/i386.h (TARGET_CPU_CPP_BUILTINS): Define __SSE4_1__ for -msse4.1. * config/i386/i386.md (UNSPEC_BLENDV): New for SSE4.1. (UNSPEC_INSERTPS): Likewise. (UNSPEC_DP): Likewise. (UNSPEC_MOVNTDQA): Likewise. (UNSPEC_MPSADBW): Likewise. (UNSPEC_PHMINPOSUW): Likewise. (UNSPEC_PTEST): Likewise. (UNSPEC_ROUNDP): Likewise. (UNSPEC_ROUNDS): Likewise. * config/i386/i386.opt (msse4.1): New for SSE4.1. * config/i386/predicates.md (const_pow2_1_to_2_operand): New. (const_pow2_1_to_32768_operand): Likewise. * config/i386/smmintrin.h: New. The SSE4.1 intrinsic header file. * config/i386/sse.md (*vec_setv4sf_sse4_1): New pattern for SSE4.1. (sse4_1_insertps): Likewise. (*sse4_1_extractps): Likewise. (sse4_1_ptest): Likewise. (sse4_1_mulv2siv2di3): Likewise. (*sse4_1_mulv4si3): Likewise. (*sse4_1_smax<mode>3): Likewise. (*sse4_1_umax<mode>3): Likewise. (*sse4_1_smin<mode>3): Likewise. (*sse4_1_umin<mode>3): Likewise. (sse4_1_eqv2di3): Likewise. (*sse4_1_pinsrb): Likewise. (*sse4_1_pinsrd): Likewise. (*sse4_1_pinsrq): Likewise. (*sse4_1_pextrb): Likewise. (*sse4_1_pextrb_memory): Likewise. (*sse4_1_pextrw_memory): Likewise. (*sse4_1_pextrq): Likewise. (sse4_1_blendpd): Likewise. (sse4_1_blendps): Likewise. (sse4_1_blendvpd): Likewise. (sse4_1_blendvps): Likewise. (sse4_1_dppd): Likewise. (sse4_1_dpps): Likewise. (sse4_1_movntdqa): Likewise. (sse4_1_mpsadbw): Likewise. (sse4_1_packusdw): Likewise. (sse4_1_pblendvb): Likewise. (sse4_1_pblendw): Likewise. (sse4_1_phminposuw): Likewise. (sse4_1_extendv8qiv8hi2): Likewise. (*sse4_1_extendv8qiv8hi2): Likewise. (sse4_1_extendv4qiv4si2): Likewise. (*sse4_1_extendv4qiv4si2): Likewise. (sse4_1_extendv2qiv2di2): Likewise. (*sse4_1_extendv2qiv2di2): Likewise. (sse4_1_extendv4hiv4si2): Likewise. (*sse4_1_extendv4hiv4si2): Likewise. (sse4_1_extendv2hiv2di2): Likewise. (*sse4_1_extendv2hiv2di2): Likewise. (sse4_1_extendv2siv2di2): Likewise. (*sse4_1_extendv2siv2di2): Likewise. (sse4_1_zero_extendv8qiv8hi2): Likewise. (*sse4_1_zero_extendv8qiv8hi2): Likewise. (sse4_1_zero_extendv4qiv4si2): Likewise. (*sse4_1_zero_extendv4qiv4si2): Likewise. (sse4_1_zero_extendv2qiv2di2): Likewise. (*sse4_1_zero_extendv2qiv2di2): Likewise. (sse4_1_zero_extendv4hiv4si2): Likewise. (*sse4_1_zero_extendv4hiv4si2): Likewise. (sse4_1_zero_extendv2hiv2di2): Likewise. (*sse4_1_zero_extendv2hiv2di2): Likewise. (sse4_1_zero_extendv2siv2di2): Likewise. (*sse4_1_zero_extendv2siv2di2): Likewise. (sse4_1_roundpd): Likewise. (sse4_1_roundps): Likewise. (sse4_1_roundsd): Likewise. (sse4_1_roundss): Likewise. (mulv4si3): Don't expand for SSE4.1. (smax<mode>3): Likewise. (umaxv4si3): Likewise. (uminv16qi3): Likewise. (umin<mode>3): Likewise. (umaxv8hi3): Rewrite. Only enabled for SSE4.1. * doc/extend.texi: Document SSE4.1 built-in functions. * doc/invoke.texi: Document -msse4.1. git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@124945 138bc75d-0d04-0410-961f-82ee72b054a4
Diffstat (limited to 'gcc/doc')
-rw-r--r--gcc/doc/extend.texi78
-rw-r--r--gcc/doc/invoke.texi8
2 files changed, 84 insertions, 2 deletions
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 2a310571b71..a09e4530977 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7396,6 +7396,84 @@ v4si __builtin_ia32_pabsd128 (v4si)
v8hi __builtin_ia32_pabsw128 (v8hi)
@end smallexample
+The following built-in functions are available when @option{-msse4.1} is
+used. All of them generate the machine instruction that is part of the
+name.
+
+@smallexample
+v2df __builtin_ia32_blendpd (v2df, v2df, const int)
+v4sf __builtin_ia32_blendps (v4sf, v4sf, const int)
+v2df __builtin_ia32_blendvpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_blendvps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_dppd (__v2df, __v2df, const int)
+v4sf __builtin_ia32_dpps (v4sf, v4sf, const int)
+v4sf __builtin_ia32_insertps128 (v4sf, v4sf, const int)
+v2di __builtin_ia32_movntdqa (v2di *);
+v16qi __builtin_ia32_mpsadbw128 (v16qi, v16qi, const int)
+v8hi __builtin_ia32_packusdw128 (v4si, v4si)
+v16qi __builtin_ia32_pblendvb128 (v16qi, v16qi, v16qi)
+v8hi __builtin_ia32_pblendw128 (v8hi, v8hi, const int)
+v2di __builtin_ia32_pcmpeqq (v2di, v2di)
+v8hi __builtin_ia32_phminposuw128 (v8hi)
+v16qi __builtin_ia32_pmaxsb128 (v16qi, v16qi)
+v4si __builtin_ia32_pmaxsd128 (v4si, v4si)
+v4si __builtin_ia32_pmaxud128 (v4si, v4si)
+v8hi __builtin_ia32_pmaxuw128 (v8hi, v8hi)
+v16qi __builtin_ia32_pminsb128 (v16qi, v16qi)
+v4si __builtin_ia32_pminsd128 (v4si, v4si)
+v4si __builtin_ia32_pminud128 (v4si, v4si)
+v8hi __builtin_ia32_pminuw128 (v8hi, v8hi)
+v4si __builtin_ia32_pmovsxbd128 (v16qi)
+v2di __builtin_ia32_pmovsxbq128 (v16qi)
+v8hi __builtin_ia32_pmovsxbw128 (v16qi)
+v2di __builtin_ia32_pmovsxdq128 (v4si)
+v4si __builtin_ia32_pmovsxwd128 (v8hi)
+v2di __builtin_ia32_pmovsxwq128 (v8hi)
+v4si __builtin_ia32_pmovzxbd128 (v16qi)
+v2di __builtin_ia32_pmovzxbq128 (v16qi)
+v8hi __builtin_ia32_pmovzxbw128 (v16qi)
+v2di __builtin_ia32_pmovzxdq128 (v4si)
+v4si __builtin_ia32_pmovzxwd128 (v8hi)
+v2di __builtin_ia32_pmovzxwq128 (v8hi)
+v2di __builtin_ia32_pmuldq128 (v4si, v4si)
+v4si __builtin_ia32_pmulld128 (v4si, v4si)
+int __builtin_ia32_ptestc128 (v2di, v2di)
+int __builtin_ia32_ptestnzc128 (v2di, v2di)
+int __builtin_ia32_ptestz128 (v2di, v2di)
+v2df __builtin_ia32_roundpd (v2df, const int)
+v4sf __builtin_ia32_roundps (v4sf, const int)
+v2df __builtin_ia32_roundsd (v2df, v2df, const int)
+v4sf __builtin_ia32_roundss (v4sf, v4sf, const int)
+@end smallexample
+
+The following built-in functions are available when @option{-msse4.1} is
+used.
+
+@table @code
+@item v4sf __builtin_ia32_vec_set_v4sf (v4sf, float, const int)
+Generates the @code{insertps} machine instruction.
+@item int __builtin_ia32_vec_ext_v16qi (v16qi, const int)
+Generates the @code{pextrb} machine instruction.
+@item v16qi __builtin_ia32_vec_set_v16qi (v16qi, int, const int)
+Generates the @code{pinsrb} machine instruction.
+@item v4si __builtin_ia32_vec_set_v4si (v4si, int, const int)
+Generates the @code{pinsrd} machine instruction.
+@item v2di __builtin_ia32_vec_set_v2di (v2di, long long, const int)
+Generates the @code{pinsrq} machine instruction in 64bit mode.
+@end table
+
+The following built-in functions are changed to generate new SSE4.1
+instructions when @option{-msse4.1} is used.
+
+@table @code
+@item float __builtin_ia32_vec_ext_v4sf (v4sf, const int)
+Generates the @code{extractps} machine instruction.
+@item int __builtin_ia32_vec_ext_v4si (v4si, const int)
+Generates the @code{pextrd} machine instruction.
+@item long long __builtin_ia32_vec_ext_v2di (v2di, const int)
+Generates the @code{pextrq} machine instruction in 64bit mode.
+@end table
+
The following built-in functions are available when @option{-msse4a} is used.
@smallexample
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d8260ba120b..21ef96cae7c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -547,7 +547,8 @@ Objective-C and Objective-C++ Dialects}.
-mno-fp-ret-in-387 -msoft-float @gol
-mno-wide-multiply -mrtd -malign-double @gol
-mpreferred-stack-boundary=@var{num} -mcx16 -msahf @gol
--mmmx -msse -msse2 -msse3 -mssse3 -msse4a -m3dnow -mpopcnt -mabm @gol
+-mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 @gol
+-msse4a -m3dnow -mpopcnt -mabm @gol
-mthreads -mno-align-stringops -minline-all-stringops @gol
-mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol
-m96bit-long-double -mregparm=@var{num} -msseregparm @gol
@@ -10260,6 +10261,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
@itemx -mno-sse3
@item -mssse3
@itemx -mno-ssse3
+@item -msse4.1
+@itemx -mno-sse4.1
@item -msse4a
@item -mno-sse4a
@item -m3dnow
@@ -10275,7 +10278,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
@opindex m3dnow
@opindex mno-3dnow
These switches enable or disable the use of instructions in the MMX,
-SSE, SSE2, SSE3, SSSE3, SSE4A, ABM or 3DNow! extended instruction sets.
+SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4A, ABM or 3DNow! extended
+instruction sets.
These extensions are also available as built-in functions: see
@ref{X86 Built-in Functions}, for details of the functions enabled and
disabled by these switches.